ALL TOGETHER NOW 


Paris climate talks bring high hopes for 
a new global agreement 


SPACESHIP EARTH | ULTRASOUND IN THE BLOOD | ANTIBODY CONFUSION


THIS WEEK 


BRAZIL Legal battle over 
purported cancer drug tests 
limits of ‘right to try’ p.410 


EDITORIALS 


WORLD VIEW No need for 
a perfect solution 
from Paris talks p.411 


CONSERVATION Mallorcan 
toad species cured of 
fungal infection p.412 




The way forward is through Paris 


Leaders must come together on a solid agreement at the United Nations climate conference — and 
then get to work at home by meeting commitments and finding new ways to reduce emissions. 


The world’s leaders left a fabulous mess in their wake after making
a brief appearance at the last major global climate summit. The
2009 Copenhagen negotiations descended into an angry free-for-all,
although one basic idea was agreed: that countries, rich and poor,
need to step forward with their own climate solutions. This idea stuck
and is now at the heart of the negotiations going into the United Nations
Paris Climate Conference, where countries will attempt to forge the
first ever fully fledged international climate agreement. Nature offers
a package of stories and commentaries this week (see
nature.com/parisclimate) previewing what many expect to be the biggest
step so far towards controlling global greenhouse-gas emissions.

That optimism should not be taken as a sign that all is well. Last year
was the warmest on record. This year will be warmer still, with average
temperatures expected to reach more than 1 °C above pre-industrial
levels. An array of impacts are already being documented around the
globe, including melting ice, decreasing crop yields and shifting
animal-migration patterns. And yet, despite a quarter of a century of
increasingly desperate debate, greenhouse-gas emissions continue to rise.

We know that any deal emerging from the Paris conference will
not solve the problem. Even if nations follow through on the climate
pledges that have been made so far, global emissions are projected to
rise until at least 2030, and temperatures could reach 2 °C above
pre-industrial levels as early as 2032. The UN has set the goal of limiting
any rise to 2 °C, but even this increase would not protect the world’s
most vulnerable citizens from rising tides, extreme weather and
shifting precipitation patterns.

PLETHORA OF PLEDGES 
But there are reasons for optimism. Foremost is the fact that a solid 
majority of nations, accounting for roughly 91% of global emissions, 
have submitted climate pledges. Many, including those of all developed 
countries, feature commitments to curb greenhouse-gas emissions. 
Others, from a plethora of developing countries, focus on sustainable 
development and adaptation to the impacts of rising temperatures. Even 
with financial and technological aid, emissions will continue to rise in 
these countries as governments seek to lift their people out of poverty. 
All told, the world’s pledges fall short. But for the first time, govern- 
ments are moving forward collectively; as David Victor and James 
Leape point out in their Comment on page 439, that is the first step. 
Although many countries want to make these commitments binding
under international law, they will remain voluntary, at least for now. The
US Senate’s aversion to international treaties is often blamed, but many
countries worry about binding commitments given the difficulty of the
economic transition that is required. The 1997 Kyoto Protocol included
binding commitments from most developed nations (notably excluding
the United States), but many developing countries received a free pass.
And there were no real consequences for those that did not live up to
their obligations.

The focus now is on building a ‘pledge-and-review’ system that
pushes countries to submit their own national commitments, which
are then up for review by other governments and groups. There is
some evidence that this ‘institutionalized peer pressure’ can work:
175 countries have voluntarily submitted pledges so far.

Economic and political momentum is building. Renewable energy
is growing faster than anybody projected just a few years ago. The
consultancy Bloomberg New Energy Finance has projected that
renewables will account for two-thirds of the US$12 trillion that will
be invested in electricity generation over the next 25 years. Brazil has
made huge progress in reducing deforestation, and the palm-oil
industry has committed to reduce deforestation in Indonesia and other
countries. The countries of the Organisation for Economic Co-operation
and Development agreed on 18 November to restrict financing for
coal-fired power plants, and the United Kingdom is weighing up a
proposal to shut down all of its coal plants by 2025. In the United
States, coal is on the ropes thanks to a combination of regulation and
cheap natural gas.

In Paris, negotiators must provide a strong framework for reporting
and verifying climate pledges. Governments, scientists and
environmentalists need solid information about who is doing what. And the
agreement should require a five-year review process so that governments
can identify ways to go even further at the next major climate
summit in 2020. Once everybody is pointed in the right direction, the
hope is that human ingenuity will kick in, and the world will discover
ways to reduce emissions more quickly.

As reported in our News Feature on page 436, however, limiting the
temperature rise to 2 °C will be difficult. Barring premature retirement
of much of the existing fossil-fuel infrastructure, the only way to get
there will be to overshoot the target and then bring atmospheric carbon
dioxide concentrations back down later in the century. Unless engineers
figure out a simple way to pull CO₂ out of the atmosphere, this probably
means deploying bioenergy at massive scales, capturing the CO₂ that is
emitted during energy production and pumping it underground.

One day, governments may decide that measures such as extreme
decarbonization are necessary. In the meantime, scientists must
investigate the social, political and economic realities ahead and research
the consequences of rising emissions, including potentially
catastrophic shifts in the climate system.

In Paris next week, world leaders must come together and signal the
way forward for their governments, their citizens and for businesses
and investors. If humans want to keep living on a planet that looks,
feels and functions like the one we live on now, it is time to sign an
agreement and get to work.


PARIS CLIMATE TALKS
A Nature special issue
nature.com/parisclimate


© 2015 Macmillan Publishers Limited. All rights reserved 


26 NOVEMBER 2015 | VOL 527 | NATURE | 409 


Built on trust 


Written agreements between parties in research 
collaborations are not a sign of a lack of faith. 


A scuffle that has riled the Chinese scientific community could
have been avoided if the parties involved had hammered out
the details of their collaboration beforehand.

On 16 November, researchers at Peking University in Beijing
claimed discovery of a biological-compass mechanism that could
explain how some animals sense magnetism (S. Qin et al. Nature Mater.
http://doi.org/89v; 2015). But some of the paper’s thunder was stolen
by a researcher at Tsinghua University, also in Beijing, who reported
in September how the same mechanism could be used to manipulate
neurons in worms (X. Long et al. Sci. Bull. http://doi.org/883; 2015).

When the September paper was published, the lead Peking
University researcher cried foul, claiming that his Tsinghua colleague
had agreed not to publish until the Nature Materials paper came out
(see Nature http://doi.org/9gg; 2015). University administrators got
involved, the Tsinghua researcher was fired, and his graduate student,
whose career has been upended, circulated a plea for support to
China’s scientific community. The Peking researcher has called for his
rival’s paper to be retracted. Both parties have mustered e-mails and
other correspondence to show that the facts are on their side.

A detailed, formalized agreement could have prevented this. When
embarking on a collaboration, it can be hard to ask a scientific peer
to sign a contract. Lawyers get involved, making it cumbersome and
costly. Fencing off rights to patents, authorship, publication and
decision-making authority can be tedious and can cause tension. A
simple handshake is much more comfortable.

This is true for researchers around the world. But in China, where 
people are finely tuned to what might make them lose face, the bar 
is especially high. Asking someone to sign such an agreement feels 
equivalent to saying that you don’t trust them. 

A survey of Chinese researchers undertaken by Nature Publishing
Group supports that observation. Scientists who had worked abroad
were asked about the differences in the working environment in China
compared with that in other countries, including the ease of carrying
out collaborations. Some noted that Chinese researchers usually do not
ask for formal agreements. The reason might be cultural, but it could
also be that most universities and research organizations in China do
not have the personnel to support this function.

The survey results appear in a 26 November report, Turning Point: 
Chinese Science in Transition (see go.nature.com/ybsatt and go.nature. 
com/fdwacj; in Chinese). The biggest hindrance to harmonious 
collaboration, according to interviewees, was tension over author- 
ship — a factor that plays a substantial part in the dispute over the 
biological-compass papers. In China, assessors of a researcher’s 
achievements focus on papers in which the individual is first or corre- 
sponding author. The report suggests that research assessment should 
take a more balanced approach, and that policymakers can iron out 
some of these wrinkles. 

It is clear that university administrators can help collaborations 
by providing personnel to deal with the legal aspects. It might be 
a burden in the short term, but in the long term it would encour- 
age collaboration. Scientists with valuable knowledge who want to 
protect their rights to priority in publication, patents and other areas 
deserve as much.


Drugs on demand 


Controversy in Brazil over access to a purported 
cancer cure could set a harmful precedent. 


A furious debate that is raging in Brazil pits the nation’s largest
university against hundreds of cancer patients who want
access to a compound that some have branded a miracle cure.

But whether the compound holds any benefits at all remains to be
seen: it has never been evaluated in human trials. The conflict is an
extreme version of a debate that has gone on in the United States and
elsewhere, as terminally ill people whose diseases have withstood
modern medicine’s proved arsenal have demanded access to untested
treatments.

As we report on page 420, courts in Brazil have previously
sympathized with those demands, ordering the University of São Paulo
to provide a compound called phosphoethanolamine to hundreds of
patients. People on both sides of this debate are armed with good
intentions. The university argues that the drug is untested, and should not
be used to give false hope, and unknown side effects, to vulnerable
patients. On the other side, it is understandable that people with little
hope may prefer the uncertainty of an untested drug to the certainty of
a terminal illness.

But there are also concerning reports that some people with
cancer are not taking their prescribed medications, for fear that
scientifically proven medicine may interfere with the supposed
miracle of phosphoethanolamine. The tenor of the debate has also
been harmful at times, with some phosphoethanolamine advocates
accusing the government or the pharmaceutical industry of actively
suppressing further development of the drug.

The sad truth is that the drug is unlikely to be a miracle. In the 
United States, for example, only one in ten drugs that make it to 
phase I clinical trials are destined to gain approval from the US Food 
and Drug Administration (FDA). And phosphoethanolamine has not 
even made it that far: its promise is backed up by a few publications 
based on lab and animal tests. 

Even so, terminally ill patients may be willing to try a treatment 
with only the slimmest odds of success. In the United States, several 
states have passed laws that, to varying degrees, grant such patients 
the right to try experimental drugs outside the purview of the FDA. 
The laws have triggered debates of their own, and have come under 
fire for offering false hope and for potentially leading patients away 
from other, more promising avenues. 

The situation in Brazil is more extreme. A university laboratory is 
neither a pharmaceutical plant nor a pharmacy; it is not required to 
follow good manufacturing protocols. There is no oversight to certify 
what is going into the blue-and-white phosphoethanolamine capsules 
produced at the University of São Paulo. Neither the compound’s side 
effects nor its efficacy are systematically monitored. To order a uni- 
versity to supply a drug is to show a disregard for the importance of 
all these safety measures. 

The hope of phosphoethanolamine lies in further research. Federal 
funders in Brazil have said that they will support further preclini- 
cal studies of the drug. Researchers are pursuing options for moving 
the compound into clinical trials, should those animal studies suc- 
ceed; patients who are interested in pursuing phosphoethanolamine 
treatment could enrol in the clinical tests. In the 
meantime, the courts should liberate patients 
from the legal tug-of-war and uphold the latest 
decision to halt distribution of phosphoethanolamine
until its potential is better understood.




WORLD VIEW


A ‘perfect’ agreement in
Paris is not essential


Success at the latest climate talks will be a recognition by the world’s nations
that incremental change will not do the job, says Johan Rockström.


So here we go again. Nations are meeting in Paris for their twenty-first
attempt to agree on decisive action to avoid what the United
Nations defines as dangerous climate change.

The climate negotiations have set this danger threshold at
1.5–2 °C of global warming above pre-industrial levels. With such
a guard rail established, the required components of a ‘successful’
climate deal more or less fall into place. A reasonable chance of
attaining 2 °C translates to a finite global carbon budget of about
900 gigatonnes of carbon dioxide from 2015 onward that must be
shared in a fair way between all nations.

Can and should the Paris talks deliver an agreement that gives a
binding commitment from all nations to meet this outcome? The last
time the world gathered for a decisive global agreement on climate
change, in Copenhagen in 2009, the remit was that, yes, world leaders
needed to do nothing less than decide on a global, legally binding
agreement that met the scientific targets of a safe and just future
below 2 °C.

But since Copenhagen, the global discourse has changed. In 2009, it
was possible to show convincingly only that we needed to tackle the
climate challenge; it was not easy to show that it was possible. Today,
the need is more apparent than ever. And, more importantly, there is
ample evidence that scaling up economically competitive, clean-energy
solutions is possible.

Before Copenhagen, economists generally thought that a high oil price
was the best way to enable a transition to a decarbonized future. The
surprising reality is that low oil prices seem to be the most effective
way of ensuring a transition away from fossil fuels. Renewable energy
systems compete even at low oil prices, which in turn closes the door on
unconventional, expensive oil, such as offshore oil and exploitation in
difficult environments such as the Arctic. It also opens a unique window
to introducing a global price on carbon, clearly the most effective
policy measure for accelerating the transition to fossil-fuel-free energy.

Experience across industrial sectors shows that new solutions can
scale up and become part of the mainstream in markets and societies
only once they have penetrated at least 15–20% of the marketplace or
society. For renewable energy, this penetration has been achieved in
enough countries only in the past three to four years.

In this new situation, is it possible to envisage a transformation to
a decarbonized world by around 2050 even if Paris does not deliver
the ‘perfect’ agreement? The answer is yes. To get there, the threshold
for success in Paris should not be at the level of ‘resolving the climate
problem’ through incremental change, but rather the assurance that
the world is serious about a transformation. We need an agreement
that is good enough to tip the world decisively towards rapid
decarbonization. A new treaty does not need to force nations into
compliance, but rather should create confidence and send the right
signal (to investors, businesses and societies at large) that the global
political leadership is turning irrevocably towards a new sustainable era.

How ambitious must the Paris agreement be to decisively support
such a trajectory? To meet the 2 °C limit, the world must cut carbon
emissions at about 6% per year. National pledges on the table at Paris
will not get us close. From experience, we know that emissions cuts in
the range of 0–2% per year are within the realm of incremental policy
measures. A range of 2–3% requires ambitious adaptation. Once levels
exceeding 3–4% are reached, experience indicates that radical measures
are needed, such as carbon taxes and the phasing out of coal power.

These are the kinds of changes needed to decarbonize the world
economy, and above all, to send clear signals of a shift from incremental
to transformative change. Success in Paris should thus be viewed as an
agreement that corresponds to a pace of emissions cuts of greater than
3–4% per year, starting in the 2015–20 window.

In turn, this would suggest that Paris must accumulate 80% of the
national pledges needed to stay within the 2 °C guard rail, with at least
20% of the countries committing to more than 4% average cuts per year,
to create a large enough critical mass of nations committed to
decarbonization and to influence the global logic
(see go.nature.com/luxlyn). Achieving this goal
is ambitious but realistic. And it comes with a decent chance that,
once nations realize the benefits of decarbonization, they will
increase their pledges. It is crucial, therefore, that the Paris agreement
allows for recurrent recalibration of the pledges, at least every third
or fifth year.

It would be dangerous to allow ‘success’ to be reduced to a low
level of political achievement so that the world continues along an
incremental policy path that stands no chance of supporting a
transition to decarbonization. Equally, scientists can no longer dismiss
as failure an agreement that is not fully in line with the demands
of climate science. For if Paris is widely perceived to have failed,
political leadership is likely once again to enter a post-Copenhagen
climate trauma and instead focus on other more urgent (and politically
rewarding) issues.


Johan Rockström is chair of the Earth League 
and director of the Stockholm Resilience Centre. 
e-mail: johan.rockstrom@su.se 
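
The arithmetic behind the emissions-cut rates discussed in the World View above can be sketched in a few lines of Python. The snippet sums annual emissions that shrink by a fixed fraction each year and compares the total with the budget of about 900 gigatonnes quoted in the text. The starting level of 36 gigatonnes of CO₂ per year and the 85-year horizon are illustrative assumptions, not figures from the article, the function name is hypothetical, and constant-percentage cuts are a simplification of any real pathway.

```python
# Illustrative sketch only: constant-percentage annual cuts from an assumed
# starting level. This does not reproduce the author's own calculation.

BUDGET_GT = 900.0        # carbon budget quoted in the text (Gt CO2 from 2015)
ASSUMED_START_GT = 36.0  # ASSUMPTION: emissions in the first year (Gt CO2)
ASSUMED_YEARS = 85       # ASSUMPTION: horizon, roughly 2015 to 2100


def cumulative_emissions(start_gt, annual_cut, years):
    """Sum emissions over `years` when each year emits (1 - annual_cut) times the last."""
    total = 0.0
    current = start_gt
    for _ in range(years):
        total += current
        current *= 1.0 - annual_cut
    return total


if __name__ == "__main__":
    for cut in (0.02, 0.04, 0.06):
        total = cumulative_emissions(ASSUMED_START_GT, cut, ASSUMED_YEARS)
        verdict = "within" if total <= BUDGET_GT else "over"
        print(f"{cut:.0%} per year: {total:.0f} Gt CO2, {verdict} the budget")
```

Under these assumptions, the deeper the annual cut, the smaller the cumulative total; changing the assumed starting level or horizon shifts the totals accordingly.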




RESEARCH HIGHLIGHTS
Selections from the scientific literature


METABOLISM 


Gastric surgery 
alters sweet tooth 


Some weight-loss surgeries 
can diminish cravings 
for sweets by altering the 
brain’s response to the 
neurotransmitter dopamine. 
Ivan de Araujo of Yale 
University in New Haven, 
Connecticut, and his 
colleagues studied the effects 
of a duodenal-jejunal bypass, 
which reroutes food from 
the stomach directly into 
the middle part of the small 
intestine. They found that 
well-fed mice that did not have 
the surgery consumed more 
sugar after previous repeated 
exposure to sweets. Mice 
that had the surgery did not 
develop the same sweet tooth. 
Sugar consumption led 
to the release of dopamine, 
which is involved in reward 
responses, particularly when 
the sugar was administered 
to the upper region of the 
intestines (the area bypassed 
in the surgery). Activating 
dopamine-sensing neurons 
restored the sweet cravings in 
mice that had undergone the 
surgery. 
Cell Metab. http://doi.org/9dm 
(2015) 


CONSERVATION


Toads saved from
killer fungus


Biologists have rid a wild toad species of a lethal fungal disease that
threatens amphibians around the world.
The chytrid fungus Batrachochytrium dendrobatidis has wiped out many
species of frogs and toads. Jaime Bosch at Spain’s National Museum of
Natural History in Madrid and his team removed tadpoles of the midwife
toad (Alytes muletensis; pictured) from ponds on the Spanish island
of Mallorca and treated them in the lab with a drug that kills the
fungus. They also drained the ponds and sprayed them with a
disinfectant before returning the tadpoles. The fungus disappeared in
four out of five treated ponds for two years.

The method may work only in some habitats, the authors say.

Biol. Lett. 11, 20150874 (2015)


ZOOLOGY


Mollusc sees with its shell


A marine mollusc has hundreds of eyes in its armour that can see images.

Christine Ortiz at the Massachusetts Institute of Technology in
Cambridge and her colleagues studied the structural, optical and
mechanical properties of the eyes of Acanthopleura granulata (pictured)
using various experimental and computational techniques. Unlike in most
animals, the microscopic lenses are not organic, but are made of the
mineral aragonite. These minimize light scattering because they are
made of large and aligned crystals. Projecting images through the lenses
showed that they could resolve an image of a potential predator of
around 20 centimetres in size from about 2 metres away.

The shells are much weaker at these points than elsewhere, but the
organism has evolved ways to compensate for the structural weakness,
the team found.

Science 350, 952–956 (2015)


Snow-fed water
supply threatened


The southwestern United States, the Iberian Peninsula and parts of the
Middle East and other regions are at risk of seasonal water shortages
resulting from decreasing snowfall in a warming climate.

Justin Mankin at Columbia University in New York and his colleagues
looked at projections from various climate models to determine how
warming might affect snowfall and river run-off in more than 400 large
basins in the Northern Hemisphere. The team identified a dozen or so
snow-sensitive basins that, across all climate models, face an 80–100%
risk of declining water supply in the coming decades. Each of the
sensitive basins, which include the Rio Grande basin spanning Texas and
Mexico, the Ebro–Duero basin in Spain and the Asi basin in Lebanon and
Syria, has a current population of more than 1 million people.
Environ. Res. Lett. 10, 114016
(2015)


NUTRITION 


Personalized 
diets for health 


People who eat identical meals 
display different blood glucose 
levels afterwards, thanks in 
part to differences in their gut 
microbes. 

Large spikes in blood glucose 
after eating increase the risk 
of type 2 diabetes, so dietary 
guidelines rank foods based 
on their glycaemic index — an 
indicator of their effects on 
blood glucose. Eran Elinav and 
Eran Segal of the Weizmann 
Institute of Science in Rehovot, 
Israel, and their colleagues 
continuously monitored 
the diets and lifestyles of 
800 people over a week, and 
found that meals with the 
same glycaemic index caused 
widely different glucose levels 
in participants. By analysing 
data on the participants’ gut 
microbiomes, physical activity 
and other clinical factors, the 
team created personalized diets 
for 26 people and found that 
these resulted in lower glucose 
levels after meals than did 
non-personalized diets. 

The study could partly 
explain the limited efficacy of 
universal dietary guidelines, 
the authors say. 

Cell 163, 1079-1094 (2015) 


Lasers reveal 
quantum jitters 


Ultrafast laser pulses can be 
used to detect the motion of 
a single atom, from energetic 
wiggles to quantum jitters. 
Kale Johnson at the 
University of Maryland 
in College Park and his 
colleagues trapped ions 
of ytterbium and zapped 
them with laser pulses 
just 10 picoseconds long. 
The pulses gave the atom 
small kicks in momentum 
of different magnitudes, 
depending on its internal 


state. This resulted in a 
new state that encoded the 
atom’s original motion. After 
another sequence of pulses, 
the researchers observed 
fluorescent light from the 
atom that allowed them to 
measure its quantum motion. 
The technique could be 
useful for future quantum 
computers built from trapped 
ions, the team says. 
Phys. Rev. Lett. 115, 213001 
(2015) 


Flower given 
digital power 


Researchers have incorporated 
electronic circuitry into the 
tissues of a rose. 

Magnus Berggren at 
Linköping University in 
Norrköping, Sweden, and 
his colleagues submerged 
the cut end of a rose stem 
into a water-based solution 
of PEDOT, a conducting 
polymer that is used in 
printable electronics. 
Capillary action pulled the 
polymer up into the rose’s 
vascular tissue, where it came 
out of solution and self- 
assembled into wires, some 
as long as 10 centimetres. By 
attaching gold probes coated 
with PEDOT to the wires, the 
researchers made individual 
transistors and demonstrated 
a simple digital circuit. 

The transistors’ electrical 
performance was on a par 
with that of conventional 
printed PEDOT circuits. 

The technology could 
eventually be used to record or 
regulate plant physiology, the 
authors say. 

Sci. Adv. 1, e1501136 (2015) 


AGRICULTURAL ECOLOGY


Complex effects of
pesticides on bees


Honeybee colonies could be compensating for the harmful effects of
certain pesticides by producing more workers, at least in the short term.

Some European countries banned neonicotinoid pesticides in 2013, but
this remains controversial because field studies have failed to confirm
the adverse effects reported for bees in the lab. Mickaël Henry at the
French National Institute of Agricultural Research in Avignon and his
colleagues positioned honeybee colonies in farmers’ fields so that they
were exposed to varying levels of the pesticide thiamethoxam. The team
radio-tagged and monitored nearly 7,000 bees, and found that pesticide
exposure caused an acceleration in death rate over time.

The colonies, however, compensated for dead foragers by producing
more workers and fewer drones. This maintains honey production but
could decrease bee reproduction in the long term. The risks of
pesticides in the field may be best understood by studying entire colony
cycles, the authors say.

Proc. R. Soc. B 282, 20152110
(2015)


Martian moon will
break apart


Phobos, one of Mars’s two moons, will disintegrate some 20 million to
40 million years from now, and its particles will form the only
planetary ring in the inner Solar System.
Benjamin Black and Tushar Mittal of the University of California,
Berkeley, made these predictions by analysing tidal and other forces
that are currently pulling Phobos (pictured) towards Mars. Using a
geological model of how rock holds together, they calculated that the
moon would rip apart before it smashed into the planet. The resulting
ring would be stable for 1 million to 100 million years, they say.
Nature Geosci. http://dx.doi.org/10.1038/ngeo2583 (2015)


SOCIAL SELECTION


Popular topics
on social media


Text-mining block prompts response


A scientist who mines the text of research publications
was blocked by the scientific publisher Elsevier from
downloading large numbers of its papers, a move that
he described in a blog post that was shared by many on
social media. Chris Hartgerink, a statistician at Tilburg
University in the Netherlands, says that the publisher is
hindering his research. Elsevier allows text-mining through
the use of a specific application programming interface,
and says that this prevents its website from being slowed
down by researchers who download large amounts of data.
Frank Huysmans, a library scientist at the University of
Amsterdam, linked to the blog post on Twitter: “How signing away
copyright to academic publishers obstructs content mining
research ... Strong case for #openaccess #tdm.”

For more on popular papers: go.nature.com/pkmi9d




SEVEN DAYS
The news in brief


EVENTS 


Whalers fined 


An Australian court fined a Japanese company Aus$1 million
(US$724,000) on 18 November and found the firm to be in contempt of
court for killing minke whales in an area declared a sanctuary by
Australia. According to the animal-protection organization Humane
Society International (HSI), which, along with the Environmental
Defender’s Office, brought the case against the firm Kyodo Senpaku
Kaisha, this is one of the largest fines issued under Australian
conservation law. The company caught whales in four different years,
despite a 2008 injunction against the practice, says the HSI.


Climate repeals 

The US Senate voted on 17 November to repeal a pair of regulations by
the Environmental Protection Agency that would limit carbon emissions
from new and existing power plants. Votes on both rules were led by
Republicans and passed by a margin of 52–46; the House of
Representatives is considering similar resolutions. Coming just two
weeks before the United Nations climate summit in Paris, the
resolutions are largely symbolic. US President Barack Obama promised
to veto both repeals, and supporters do not have the two-thirds
majority needed to override a veto.


Statoil Arctic exit 


Norwegian energy company
Statoil announced on
17 November that it would
cease exploration for gas and
oil in Alaska’s Chukchi Sea.
The decision comes just over a
month after Royal Dutch Shell
suspended its own exploration
off the Alaskan coast, citing
regulatory uncertainty and a
disappointing survey of the
area’s fossil-fuel prospects.
The Statoil decision sees the
company exit early from
16 leases that were set to expire
in 2020.


The news in brief


Tasmanian devils returned to the wild


Tasmania has 39 more wild devils, after the latest
batch of healthy individuals was released from
the Devils Ark Sanctuary (pictured is manager
Dean Reid) onto the Forestier Peninsula on
18 November. The area was cleared of Tasmanian
devils (Sarcophilus harrisii) after an infectious
cancer that is devastating populations of the
endangered animals was detected there in 2004.
A ‘devil-proof fence’ has now been installed to
prevent the new, healthy population from mixing
with animals afflicted with the deadly and
infectious devil facial tumour disease.


L’Aquila verdict

Italy’s highest court of
appeal on 20 November
upheld a decision to acquit
6 seismologists accused of
manslaughter in regard to the
2009 L’Aquila earthquake,
which killed more than
300 people. Prosecutors
claimed that the scientists
misled townspeople about the
risk, leading them to stay in
their homes instead of seeking
safety. The scientists were
originally given six-year prison
sentences, but an appeals court
in L’Aquila acquitted them
last November, and reduced
to two years the sentence of
Bernardo De Bernardinis,
former deputy director of
the Italian Civil Protection
Department, who was also
convicted. De Bernardinis’s
reduced sentence was upheld;
he still faces a separate charge
of manslaughter.


Ebola setback


In a setback to efforts to end the
Ebola epidemic in West Africa,
the World Health Organization
announced three new cases
of the disease in Liberia on
20 November. One of those
individuals, a 15-year-old boy,
died on 23 November. The
country had been declared
Ebola-free on 3 September.
Sierra Leone was declared
Ebola-free on 7 November,
and the last case in Guinea
was reported on 29 October,
leading to hopes that the
epidemic, which began in
December 2013, might finally
be nearing an end.


Pandemic report 

A panel of physicians, 
scientists and policy 

experts has called for major 
reforms to the World Health 
Organization and other 
international health-response 
systems following the Ebola 
epidemic that has killed more 
than 11,000 people. The 
panel, convened by Harvard 
University in Cambridge, 
Massachusetts, and the 
London School of Hygiene and 
Tropical Medicine, released 
its report on 22 November
(S. Moon et al. Lancet http://doi.org/9gf;
2015). It also
recommends measures to
improve prevention, detection
and response to outbreaks, and
to speed research on diseases
that cause them. See
go.nature.com/jxxvs6 for more.


Rare rhino dies

Northern white rhinoceroses
(Ceratotherium simum
cottoni) are one step closer to 
extinction, after a 41-year-old 
female named Nola had to be 
put down after surgery at the 
San Diego Zoo Safari Park in 
California on 22 November. 
The last three remaining 
individuals — two females 
that cannot reproduce 
naturally and a male with a 
low sperm count — live at 

Ol Pejeta Conservancy in 
Kenya. Conservationists hope 
that the species can be saved 
through assisted reproduction 
techniques, using southern 
white rhinos (Ceratotherium 
simum simum) as surrogates. 


POLICY 


CRISPR cress 

The Swedish Board of 
Agriculture on 17 November 
told two Swedish universities 
that they do not need special 
approval for field trials of 
some cress (Arabidopsis, 
pictured) varieties mutated 
by the CRISPR-Cas9 gene-editing
technique. In June,
the European Commission
had asked European Union
member states to hold back
on such rulings until it makes
its own proposals on how to
regulate organisms modified
by new genetic techniques.
But the Swedish authority
said decisions needed to be
made now, so that trials can be
prepared for the next growing
season.


TREND WATCH


The availability of antiretroviral
drugs for HIV has increased
women’s lifespans more than
men’s in KwaZulu-Natal in South
Africa, concludes a study of more
than 98,000 people (J. Bor et al.
PLoS Med. 12, e1001905; 2015).
Since free antiretroviral treatment
became available in South
Africa in 2004, declines in life
expectancy have reversed for both
genders. But progress is uneven,
with women gaining more years
of life than men. The authors
recommend that HIV outreach
activities be targeted to men.

SOURCE: J. BOR ET AL. PLOS MED. 12, E1001905 (2015)


Chimps retired 

The US National Institutes 
of Health (NIH) is ceasing 

its chimpanzee-research 
programme altogether, two 
years after retiring most of 
its chimps. In a 16 November 
e-mail to the agency's 
administrators, NIH director 
Francis Collins announced 
that the 50 NIH-owned 
animals that remain available 
for research will be sent to 
sanctuaries. The agency will 
also develop a plan to phase 
out NIH support for the 
remaining chimps that are 
supported, but not owned, 
by the NIH. See page 422 for 
more. 


Coal curbs 

The Organisation for 
Economic Co-operation 
and Development agreed 


on 18 November to restrict 
public financing for coal-fired 
power plants. Two years in 
the making, the agreement 
removes support for large, 
low-efficiency coal-fired 
plants while maintaining 
support for medium-sized, 
high-efficiency plants in 
countries facing energy 
shortages, and for small, less- 
efficient plants in the poorest 
countries. The restrictions 
will not apply to any coal- 
fired plants that are equipped 
to capture and store carbon 
emissions. 


FUNDING


UK research review


A tensely awaited report
into the future of the major
UK research-funding
agencies, released on
19 November, suggests
the creation of a powerful
umbrella organization called
Research UK to manage the
agencies. The review was
led by Nobel-prizewinning
geneticist Paul Nurse. Many
scientists feared that it would
recommend a total merger of
the research councils, which
collectively distribute some
£3 billion (US$4.6 billion) of
government research funding
each year. Nurse recommends
that Research UK be led by an
experienced researcher, who
would in effect be boss of the
heads of the seven discipline-based
councils. See go.nature.com/2rwzeu
for more.


HIV LIFE EXPECTANCIES DIFFER BY GENDER


Freely available antiretroviral therapy has decreased the mortality
rate of HIV-positive women more than that of HIV-positive men in one
region of rural South Africa.

[Chart: deaths per 100 person-years, 2001 to 2011]


30 NOVEMBER
The leaders of the
world’s nations gather
to broker a climate deal
at the United Nations
Paris Climate Change
Conference.
nature.com/parisclimate


1-3 DECEMBER
Washington DC hosts
the International Summit
on Human Gene Editing.
go.nature.com/huzip3


2 DECEMBER
The European Space
Agency's LISA
Pathfinder satellite,
which will hunt for
gravitational waves,
launches from Kourou,
French Guiana.
go.nature.com/rxrzuc


BUSINESS


Mega-merger


Two major pharmaceutical
firms are to merge in a
US$160-billion deal, they 
announced on 23 November. 
Pfizer of New York will 
combine with Allergan, based 
in Dublin, in a merger that is 
expected to be completed by 
the end of 2016. The resulting 
firm will be named Pfizer 

but will be headquartered 

in Dublin — providing a 
significant tax break for the 
US firm — and will have more 
than 100 medicines in mid-to- 
late-stage development. 


NATURE.COM
For daily news updates see:
www.nature.com/news




PAUL DARROW/NYT/REDUX/EYEVINE 


NEWS IN FOCUS


Climate optimism builds ahead of Paris talks p.418
Court ends distribution of unproven cancer treatment p.420
Retired research chimps preserved in online database p.422
25-year quest for a climate treaty: the comic book version p.427


AquAdvantage Atlantic salmon (at back) grow to twice the size of a normal Atlantic salmon (Salmo salar) over the same time.


BIOTECHNOLOGY 


Transgenic salmon leaps 
to the dinner table 


Long-awaited decision by US government authorizes the first genetically engineered 


animal to be sold as food. 


BY HEIDI LEDFORD 


A breed of fast-growing Atlantic
salmon rocketed to celebrity status on
19 November when it became the first 
genetically engineered animal to be approved 
for human consumption in the United States. 
The landmark decision by the US Food 
and Drug Administration (FDA) releases the 
‘AquAdvantage’ salmon from two decades of 
regulatory limbo — but it could also revitalize 


an industry that has waited a long time for any 
sign that its products might make it to market. 
“It opens up the possibility of harnessing this
technology,” says Alison Van Eenennaam, an
animal geneticist at the University of California,
Davis. “The regulatory roadblock had really
been disincentivizing the world from using it.”
The FDA decision comes at a time when the 
US government is re-evaluating how it regu- 
lates genetically engineered crops and animals. 
On 2 July, the White House Office of Science 


and Technology Policy said that it will update 
those regulations — for the first time since 
1992 — over the next year. And at a meeting 
on 18 November, the US Department of Agri- 
culture (USDA) discussed preliminary plans to 
revise its guidelines for genetically engineered 
crops. 

A key driving force for these discussions 
is the recognition that current regulations 
may not cover crops and animals engineered 
using cutting-edge techniques, such as the 




CRISPR-Cas9 system, that allow researchers
to make targeted changes to the genome. The 
USDA has already determined that its regula- 
tions do not apply to several genome-edited 
crops. Van Eenennaam says that it is still 
unclear how the FDA will regulate animals that 
have been engineered using that technology. 

“There is a lot going on these days,” says 
Greg Jaffe, director of biotechnology at the 
Center for Science in the Public Interest in 
Washington DC. “But obviously, up until the 
decision about the salmon, people were mostly 
focusing on the crop side.”

AquaBounty Technologies, based in 
Maynard, Massachusetts, filed its first applica- 
tion to the FDA for approval of the salmon in 
1995. The agency completed its food-safety 
assessment in 2010, and released its environ- 
mental-impact statement at the end of 2012. 
The long delay between the completion of 
those steps and a final decision led to rumours 
of political interference. 

But Laura Epstein, a senior policy analyst 
for the FDA’s Center for Veterinary Medicine, 
says that the approval took so long because 
it was the first of its kind. “With most prod- 
ucts that are the first of its kind, we are very 
careful,” she says. The agency also had to
wade through many public comments before 


it could issue a decision, she adds. 

It is unclear how the salmon will fare on 
the market. AquAdvantage fish produce extra 
growth hormone, allowing them to grow to 
market size in 18 months, rather than the 
usual 3 years. In the time since AquaBounty 
first filed for approval, fisheries have bred con- 
ventional salmon that grow just as fast, says 
Scott Fahrenkrug, chief executive of Recombinetics,
an animal-biotechnology firm
in St Paul, Minnesota.

Then there is the matter of consumer
acceptance: several
grocery chains have said that they will not
carry the salmon, which, even at full production,
would amount to only a tiny fraction of
total US salmon imports. “It’s a drop in the
bucket,” says Jaffe. “Consumers would have to
hunt to find salmon that are genetically engineered,
as opposed to avoiding them.”

Still, the FDA’s approval met with swift
opposition from some environmental and
food-safety groups. Although AquaBounty
uses physical and biological safeguards to
reduce the chance that its salmon will escape
into the wild, opponents fear that an accidental
release could alter natural ecosystems. They


are also unhappy that the FDA will allow the 
fish to be sold without any label to indicate that 
it is genetically engineered. 

“Huge numbers of people have said, ‘Yes, we
want it labelled’,” says Jaydee Hanson, a senior
policy analyst at the Center for Food Safety,
an environmental-advocacy group in Washington DC.
“If this is such a good product, the
company itself should be saying it will label it.”

The FDA declined to comment on whether 
other applications for genetically engineered 
animals are in the regulatory pipeline. But 
Fahrenkrug says that his company is develop- 
ing several such animals, including cattle that 
do not have to be dehorned and pigs that do 
not need to be castrated. 

Recombinetics’ animals are engineered 
using genome-editing techniques that Fahren- 
krug argues do not require FDA approval. The 
agency regulates animals that are engineered 
using a “recombinant DNA construct”, but
his animals are modified by injecting protein
and RNA into embryos. “It’s a treatment, not a
transgene,” he says.

The FDA has yet to announce how it will view
such animals, but Fahrenkrug takes approval of
the salmon as a sign that the agency is willing
to allow them onto the market. “I’m feeling
optimistic now,” he says.


PARIS CLIMATE TALKS 


Pledges raise hopes 
ahead of climate talks 


Momentum builds for a new treaty as world leaders prepare 


to descend on Paris. 


BY JEFF TOLLEFSON 


The road to a new global climate treaty
has been slow and plodding. But years
of delicate negotiations have given way
to cautious optimism as more than 190 nations 
prepare for the marathon climate talks that 
begin in Paris on 30 November. 

Some long-running disputes remain, such as 
the debate about what cuts in greenhouse-gas 
emissions can be expected of developing nations 
compared with their developed counterparts. 
But there are many signs that the summit, 
convened by the United Nations, will succeed 
in crafting a global climate agreement. These 
include significant commitments by several 
major players, including the United States and 
China, to reduce emissions of greenhouse gases. 

“We are in for some tense negotiations, but
I think we’ll come out of the other end with
an agreement,” says Saleemul Huq, director of
the International Centre for Climate Change 
and Development in Dhaka, Bangladesh, 
and adviser to a negotiating bloc of the least- 
developed countries. 

And although Paris is still reeling from the 
deadly terror attacks of 13 November, which 
led the authorities to increase security for the 
meeting and cancel a big climate march, more 
than 130 heads of government and state are still 
expected to attend the two-week summit. 

The last major push for a climate treaty 
faltered in Copenhagen six years ago over 
whether developing countries should be 
asked to match developed countries and make
voluntary commitments to reduce emissions.
The political situation has evolved since then
and more than 165 countries have submitted
pledges to combat climate change. Although
these pledges would not cut greenhouse-gas
emissions enough to meet the UN goal of limiting
global warming to 2 °C above pre-industrial
levels, they show a level of commitment that
was missing in Copenhagen.

PARIS CLIMATE TALKS
A Nature special issue
nature.com/parisclimate

“Countries are bringing more political will
than ever before, and so we'll see if the process
can deliver,” says Elliot Diringer, executive
vice-president of the Center for Climate and
Energy Solutions, an environmental think tank
in Arlington, Virginia. “This agreement has
the potential to be a significant turning point.”

Despite a lingering — and potentially 
volatile — debate about whether those com- 
mitments will be legally binding under inter- 
national law, they are expected to remain 
voluntary. One of the biggest obstacles to a bind- 
ing agreement is the US Senate. On 17 Novem- 
ber, Republican senators pushed through 
legislation seeking to block regulations to limit 
greenhouse-gas emissions from power plants. 
US President Barack Obama can veto these bills, 
but he cannot force the Senate, which has the 
power to reject or approve treaties, to endorse a 
climate agreement that includes binding limits 
on greenhouse-gas emissions. 

As a result, much of the debate will centre
on creating mechanisms that allow govern- 
ments — and civil society — to monitor pro- 
gress, build trust and ensure accountability. 


Environmentalists and many governments 
are pushing for a five-year review period that 
would begin immediately after the Paris talks 
end; governments would need to return to the 
table with new commitments in 2020. 

Huq says that this exercise is particularly
important for poor and vulnerable countries,
which are pushing for a long-term goal of limiting
warming to 1.5°C. The world is likely to
cross a landmark threshold, the 1°C mark,
for the first time in 2015, and Huq admits
that stabilizing at 1.5°C would require emissions
reductions so drastic as to be politically
impossible at this point. But world leaders
should acknowledge that even 2°C of warming
comes with significant impacts on the world’s
poorest citizens, he says. “We know we are not
going to get everything we want in Paris, but
it’s symbolic.”

Samantha Smith, leader of environmental 
group the WWF's Global Climate and Energy 
Initiative in Oslo, says that the biggest debate 
in Paris will be over financial aid to help poor 
countries to reduce their emissions and cope 
with the impacts of climate change. In 2010, 
wealthy nations established a Green Climate 




Fund and committed to increase climate aid to 
US$100 billion annually by 2020. Developing 
countries will be looking for details about that 
commitment and what comes next. 

The good news, Smith says, is that the 
conversation about climate action has changed, 
not just within the negotiations but among 
faith groups, the general public and businesses, 
many of which will make their own voluntary 
emissions commitments in Paris. But she cau- 
tions that a new global treaty is just a first step. 
“When we walk out of there, we are still going 
to have a lot of work to do.”


ESPEN RASMUSSEN/PANOS 


ENVIRONMENT 


Green Climate Fund 
faces slew of criticism 


First tranche of aid projects prompts concern over operations of fund for developing nations. 


BY SANJAY KUMAR 


Major questions are swirling around
the operations of a United Nations
fund that is supposed to channel
billions of dollars to help developing nations 
adapt to climate change and slow its pace. 

The Green Climate Fund (GCF) was established
at UN talks in Cancún, Mexico, five
years ago, and developing nations see it as one 
of their prime hopes for financial assistance in 
tackling a warming world. 

Yet the fund, which is administered by a 
small team in Incheon, South Korea, is strug- 
gling to raise cash from rich nations. And 
although it approved its first aid commitments 
on 6 November at a meeting in Livingstone, 
Zambia, observers say they are concerned that 
the GCF has cut corners so as to announce 
handouts before international climate talks in 
Paris in December. 

“We are worried about the fund’s social
and environmental safeguards, consultation
processes, accountability mechanisms and
transparency,” says Brandon Wu, a policy analyst
who focuses on climate finance at the non-governmental
organization (NGO) ActionAid
in Washington DC and who attended the Zambia
meeting.

The Cancún agreement recommended
that climate aid total US$100 billion a year
by 2020, but the balance between private and
public money, and how much of it would flow
through the GCF, has not been made clear.

In the world of climate finance, the GCF
is a tiny player. If funding for renewable
energy and energy-efficiency programmes
is included, hundreds of billions of dollars
already flow round the globe each year, says
the Climate Policy Initiative (CPI), an international
think tank. Still, the GCF is the largest
international public climate fund.

Flood barriers in Bangladesh could find support from a United Nations climate fund.

The fund’s initial target was to collect 
$10 billion before it started handing out cash, 
which it intends to divide equally between mit- 
igation and adaptation projects. By October, it 
had received pledges of $10.2 billion — which 
foreign-exchange rate variations have reduced 


to $9.1 billion. But only $5.83 billion had been 
formally agreed, and just $852 million had 
reached the fund’s pocket. The United States 
is the most significant missing name from the 
list of donor countries: last year it promised 
$3 billion, but it has yet to sign an agreement 
to contribute money. 

“At this pace we will not be able to do 
anything much,” says Dipak Dasgupta, an 
economist and India’s representative on the 
24-person GCF board. The proposals
approved in Zambia — $168 million for
eight climate projects — are “small change”,
he says. The approvals include a wetlands
resilience programme in Peru, climate- 
resilient infrastructure in Bangladesh and a 
scheme of ‘green bonds’ to finance sustain- 
able energy ventures in Latin America and 
the Caribbean, but seven of the schemes will 
not receive money until they meet further 
project-specific conditions. 

Developed nations may be reluctant to 
transfer their money to the fund, says Tim- 
mons Roberts, who studies climate change 
and economic development at Brown Uni- 
versity in Providence, Rhode Island. “Many 
developing countries and NGOs believe 
that the funding should all flow through the 
GCF,” he says. “However, contributor countries
have always defended their ability to
funnel their funds through channels they 
control, whether through their own bilateral 
agencies (like USAID) or through dedicated 
World Bank funds.” 


LACK OF TRANSPARENCY 

There are also concerns about how the GCF is 
run, says Wu, who attended the Zambia meeting
as a permitted ‘civil society observer’. Wu
is worried that indigenous communities were 




not adequately consulted before the approval 
of $6.2 million for the Peruvian wetlands pro- 
gramme, for example. GCF documents say that 
a consultation was carried out, but for this and 
for other projects, the fund has no independ- 
ent verification of its claims, says Andrea Rod- 
riguez Osuna, who works in Mexico City for 
the non-profit environmental law organization 
AIDA and was also present in Zambia. 

Nor is the GCF transparent about its pro- 
cesses, Rodriguez Osuna adds. “The fund 
has no information disclosure policy and no 
accountability mechanism, yet the board is 
approving project proposals,” she says. 

For the eight projects approved at the board 
meeting, for example, only proposal docu- 
ments were publicly available (and in the case 
of two private-sector projects, only a sum- 
mary). “These are hardly the unbiased sources 
of information needed to evaluate a project's 
merits or any potential negative impacts,” 
Wu says. Project reviews made by the fund’s 
board and by an independent technical advi- 
sory panel are not publicly released, and GCF 


officials repeatedly failed to answer questions 
asked by Nature for this article. 

For some, another contentious issue is that
the GCF is channelling its money mainly through
international organizations, including multilateral
and private banks such as the World Bank
and Deutsche Bank, rather than sending it
directly to institutions in the developing countries
where the projects are taking place.

The GCF is still new and is seriously under- 
staffed, Rodriguez Osuna adds; and observers 
hope that their worries are teething problems. 
Its executive director, Héla Cheikhrouhou, 
has promised “many more projects under 
development”. 

Claims have already been made that rich 
nations are upscaling public climate funding. 
But experts say that there is little clarity on 
whether the cash is new money, or being re- 
routed from elsewhere, such as from overseas 
development assistance funds. “Definitions 
of what constitutes new money haven't been 
agreed on,” says Barbara Buchner, who leads
CPI’s global finance programme in Venice, Italy.

One thing is for certain, Buchner
says — total finance for low-carbon energy pro- 
jects and for adapting to and mitigating climate 
change is far short of estimates of the need. “We 
need trillions, not billions,” she says.


SINCLAIR STAMMERS/SPL 


Brazilian courts tussle over 
unproven cancer treatment 


Patients demand access to compound despite lack of clinical testing. 


BY HEIDI LEDFORD 


A court in the Brazilian state of São Paulo
has cut off distribution of a compound
that is hailed by some as a miracle
cancer cure — even though it has never been 
formally tested in humans. 


On 11 November, to the relief of many
cancer researchers, a state court overturned
earlier court orders that had obliged the nation’s
largest university to provide the compound
to hundreds of people with terminal cancer.
Although the reversal applies only to requests
for the drug by residents of São Paulo state,
administrators at the university estimate that
it covers about 80% of the orders they have
received for the compound.

The compound, phosphoethanolamine,
has been shown to kill tumour cells only in
lab dishes and in mice (A. K. Ferreira et al.
Anticancer Res. 32, 95-104; 2012). Drugs that
seem promising in lab and animal studies have
a notoriously high failure rate in human trials.
Despite this, some chemists at the University of
São Paulo’s campus in São Carlos have manufactured
the compound for years and distributed
it to people with cancer. A few of those patients
have claimed remarkable recoveries, perpetuating
the compound’s reputation as a miracle cure.


Phosphoethanolamine capsules were manufactured at the University of São Paulo.
CECILIA BASTOS/USP


MORE ONLINE

‘Gene drive’ mosquitoes engineered to fight malaria go.nature.com/rodnpf

MORE NEWS
● Graph-theory breakthrough tantalizes mathematicians go.nature.com/8mgjdx
● The world’s biggest volcano is a magnetic mix-up go.nature.com/2mgglh
● Ebola experience leaves world no less vulnerable go.nature.com/jxxvs6

Daily news, commentary and video updates from the Paris climate talks
parisclimatetalks2015.tumblr.com

Dismayed by this unofficial distribution 
of phosphoethanolamine, the university’s 
administration moved in September 2015 to 
shut it down. Patients took the university to 
court, and in October 2015, Brazil’s Supreme 
Federal Court ruled in favour of one plaintiff 
who wanted the right to try the compound. A 
lower court then began granting orders for the 
university to provide it to others. University 
officials say that they were soon overwhelmed 
by more than 800 requests. 

“The decision not only ignored the opinion 
of medical specialists, but also overlooked the 
fact that the drug has only been tested on ani- 
mals,” says bioethicist Volnei Garrafa at the 
University of Brasilia. “Such court decisions 
bring false expectations for patients and their 
families, creating turmoil in society and con- 
fusion between what is safe and what is not.” 

The Brazilian constitution guarantees 
universal access to health care, and it is com- 
mon in Brazil for patients to turn to the courts 


to access drugs that the state health-care 
system does not dispense because of their cost, 
says Garrafa. But phosphoethanolamine pre- 
sents a different situation, he adds, because it 
is not really a drug at all. It is not approved by 
Brazil’s National Health Surveillance Agency.

Those who argue that people who are terminally
ill have a right to try experimental
medicines saw the decision earlier this year 
as a significant victory. But to the university 
administration, drug regulators and cancer 
researchers, it showed blatant disregard for the 
basic scientific principle that a drug should be 
demonstrated to be safe and effective before 
being given to patients outside of a clinical trial.

“It’s a violation of the autonomy of the university,”
says Marco Antonio Zago, a physician
and president of the University of São Paulo.
“We are seen as a factory to produce something
that we do not believe should be done.”

Phosphoethanolamine is an important
building block of the lipids that make up cell 
membranes. The compound can also act as a 
molecular signal that activates certain cellular 




processes. Although some studies do suggest 
that the compound may kill cancer cells in 
isolated cells and mice, it is not entirely clear 
how the compound brings about this response. 
Biochemist Durvanei Augusto Maria at the 
Butantan Institute in São Paulo believes that
the compound may be imported into tumour
cells and, once inside, trigger processes that
cause the cell to self-destruct. Immunologist
James Venturini at São Paulo State University
and his colleagues have found that phosphoethanolamine
may modulate the immune
system’s response to cancer or affect cell divi- 
sion (M. S. P. de Arruda et al. Braz. Arch. Biol. 
Technol. 54, 1203-1210; 2011). 

But to justify using phosphoethanolamine 
in people, Venturini says, one would have to 
rigorously test it in a series of clinical trials 
using human volunteers. “I strongly believe 
that double-blind, randomized clinical studies
are necessary,” he says.

And even before such trials, further preclini- 
cal studies would have to be done, says Jailson 
Bittencourt de Andrade, secretary for research- 
and-development policy at Brazil’s science and 
technology ministry. The ministry plans to fund 
those studies, he says, and has already asked sev- 
eral research laboratories in the country to do 
the work. If those tests and subsequent clinical 
trials are successful, he says, the ministry will 
also fund the research needed to scale up phos- 
phoethanolamine production to the quantities 
and quality needed for an approved drug. 

That process will take years. In the 
meantime, lawyers representing people with 
cancer have vowed to appeal against the latest 
ruling. If those appeals succeed, de Andrade 
worries that people will not wait until all the 
tests are completed, and may even abandon 
conventional treatment in favour of phospho- 
ethanolamine. “Many patients have come 
forward and said they have tried the drug 
and it has worked for them,” he says. “So the 
other patients and their families — they want 
phosphoethanolamine now.” SEE EDITORIAL P.410


TIMEKEEPING 


Leap-second decision delayed 


Nations fail to agree on whether to scrap an adjustment that keeps official 
time in sync with Earth’s rotation. 


BY ELIZABETH GIBNEY 


A leap second is gone in the blink of
an eye. But a decision on whether
to ditch these occasional time inser- 
tions — which keep official time synced with 


Earth’s rotation — has been delayed for at 
least eight years. 


This month, the International Telecommuni- 
cation Union (ITU), which bears responsibility 
for defining official Coordinated Universal 
Time (UTC), was expected to reach a consensus. 
But representatives who discussed the issue at 
the World Radiocommunication Conference in 
Geneva, Switzerland, failed to agree on whether 
the leap second’s costs outweigh its benefits. 


Leap seconds, which occur once every few
years, are necessary because Earth’s rotation is
slowing in an unpredictable way. Without them,
the time of day when the Sun is at the highest
point in the sky would drift by about one minute
over about 100 years. However, these extra
seconds have to be programmed into electronic
systems manually and can upset systems that
depend on accurate timings.

26 NOVEMBER 2015 | VOL 527 | NATURE | 421

© 2015 Macmillan Publishers Limited. All rights reserved

Most countries, including China, the 
United States and large parts of Europe, 
favour scrapping the leap second and basing 
UTC on the continuous tick of atomic clocks.

Official time would slowly move out of 
sync with Earth’s rotation, but — given that 
it would take thousands of years to accu- 
mulate a difference that is greater than the 
shifts already caused by daylight savings 
time — many argue that this would cause 
few problems. “We are already shifted by 
one hour in summer compared to winter 
time,” says Elisa Felicitas Arias, director of 
the Time Department at the International 
Bureau of Weights and Measures (BIPM) in 
Sevres, France, who wants to scrap the leap 
second. “Are we affected because of that?” 
A correction — perhaps a leap minute or 
hour — could be added once the drift is 
appreciable. 
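
The drift figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the offset accumulates at a constant average rate of about one minute per century (the figure given in this article), whereas Earth's rotation in fact slows unpredictably. The function name and the linear model are hypothetical and are not part of any timekeeping standard.

```python
def years_to_accumulate(offset_seconds, drift_seconds_per_century=60.0):
    """Return the years needed for official time to drift by offset_seconds.

    Assumes a constant average drift rate (a simplification; the real
    rate of Earth's rotational slowing is irregular).
    """
    if drift_seconds_per_century <= 0:
        raise ValueError("drift rate must be positive")
    # years = offset / (drift per year), with drift per year = per-century / 100
    return offset_seconds * 100.0 / drift_seconds_per_century


if __name__ == "__main__":
    one_hour = 3600  # size of the daylight-saving shift, in seconds
    years = years_to_accumulate(one_hour)
    print(f"Years to drift by one hour: {years:.0f}")
```

Under that assumed rate, an hour of drift takes several thousand years to build up, which is in line with the 'thousands of years' cited by those who favour scrapping the leap second.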

A small number of countries, however,
including Russia and the United Kingdom, 
want to keep the leap second. Russia is 
concerned about how its global navigation 
system, GLONASS — the only one to incor- 
porate leap seconds — would cope, says 
Vincent Meens of France's National Centre 
for Space Studies, and the chair of the ITU 
subgroup tasked with debating the topic. 
Britain’s argument is based largely on the 
desire to keep a link between official time 
and Earth's rotation, says Peter Whibberley, 
a metrologist at the National Physical Labo- 
ratory in Teddington, UK. 

Astronomers are among those who 
would be affected if the leap second were to 
be scrapped. Their software would need to 
cope with Earth's rotational time — which 
defines when stars and galaxies are seen 
in the sky — being offset by more than a 
second from universal time, says Meens. 

On 18 November, the ITU announced 
that it would defer a decision until 2023 
when it will have more information on the 
impacts of losing the second. 

The union did, however, decide to make
changes to the international treaty that
currently defines UTC, and in turn the leap
second. Rather than having a stand-alone
definition of UTC, the treaty will cite an SI
definition, and mention of the leap second
will move to become part of a ‘description
of UTC’ in a subsidiary section of the treaty
that expires in 2023.

Whibberley says that the effect will be 
to remove responsibility for UTC from the
ITU, and that the General Conference on 
Weights and Measures (CGPM) — which 
is already responsible for defining SI units, 
including the second — is most likely to 
become the authority in the future. But 
the change is unlikely to speed up the deci- 
sion on whether to scrap the leap second: 
the CGPM’s next chance to even propose a 
change is not until 2018.




Decades of studies on chimpanzee brains and behaviour will be captured in an online resource. 


BIOMEDICAL RESEARCH 


Chimps retire to 
a digital world 


NIH to fund a cache of brain tissue and online data in 
place of live-animal experimentation. 


BY SARA REARDON 


Panzee the chimpanzee was a skilled
communicator that could tell untrained
humans where to find hidden food by
using gestures and vocalizations. Austin the
chimp was particularly adept with a computer,
and scientists have been scanning its genome
for clues to its unusual cognitive abilities.
Both apes lived at a language-research centre
at Georgia State University in Atlanta, and both
died several years ago, but they will live on in
an online database of brain scans and behavioural
data from nearly 250 chimpanzees.
Researchers hope to combine this trove, now 
in development, with a biobank of chimpan- 
zee brains to enable scientists anywhere in the 
world to study the animals’ neurobiology. 




This push to repurpose old data is espe- 
cially timely now that the US National Insti- 
tutes of Health (NIH) has decided to retire its 
remaining research chimpanzees. The agency 
decommissioned more than 300 animals in 
2013, but kept 50 available for research in case 
of a public-health emergency. Following an 
18 November decision, this remaining popula- 
tion will also be sent to sanctuaries in the com- 
ing years. The NIH also hopes to retire another 
82 chimps that it supports but does not own, 
says director Francis Collins. 

“We were on a trajectory toward zero, and 
today’s the day we’re at zero,” says Jeffrey Kahn,
a bioethicist at Johns Hopkins University in 
Baltimore, Maryland, who led a 2011 study 
on the NIH chimp colony for the Institute of 
Medicine. 


VINCENT J. MUSI/NATL GEOGRAPHIC CREATIVE 


The NIH’s latest move, along with a decision 
in June by the US Fish and Wildlife Service 
to give research chimps endangered-species 
protections, effectively ends the possibility 
of biomedical research on the animals in the 
United States. 

The retirement of the NIH chimps will also 
end non-invasive studies on the 139 NIH- 
owned animals at the University of Texas MD 
Anderson Cancer Center primate facility 
in Bastrop. Its director, Christian Abee, says 
that researchers have published more than 
50 behavioural studies since 2012 using these 
animals. “There is no other alternative for 
cognitive research in chimpanzees,” he says.

That makes the NIH-funded chimp 
database all the more important. “This is a 
very unique window of opportunity to make 
sure that there's a legacy and a contribution 
from the lives they have lived,” says project 
leader Chet Sherwood, a biological anthro- 
pologist at George Washington University in 
Washington DC. 


ONLINE LEGACY 

In the next few months, Sherwood’s team
plans to launch a website with a database for
researchers and an educational component
for the public. The site will eventually include
existing data on the chimps’ performance in
behaviour and personality tests, scans of the
primates’ brain structure and activity, and
their pedigrees and some genetic information.
Sherwood and his colleagues plan to model the
website on that of the Human Connectome
Project, an open-access collection of brain
scans from 1,200 individuals that researchers
can use to study the links between brain
structure and activity and human traits.

The team is also collaborating with
the Allen Brain Institute in Seattle, Washington,
to create an atlas of gene expression
in the chimp brain. Researchers
who want to study chimp brains in more
detail can request tissue and blood samples
from the team, which has nearly 250 preserved
organs stored at facilities in Washington DC
and Atlanta.

But some scientists and advocates worry
about the consequences of losing access to
research chimps. Frankie Trull, director of
the Foundation for Biomedical Research in
Washington DC, which advocates for animal
research, says that the US government may
regret its decision if a public-health threat
emerges that would be best studied in chimpanzees.
Others caution that the dwindling
number of research animals will make it difficult
to develop therapies, such as vaccines
against Ebola, for wild chimps, which would
help both the animals and human beings.

In the meantime, the NIH is struggling to 
find homes for its newly retired chimps. By law, 
retired animals are sent to a federal sanctuary 
known as Chimp Haven in Keithville, Louisi- 
ana, but that facility has only 25 places available 
now. Nearly 310 NIH-owned animals need to 
be resettled, and Collins says that the agency 
is still evaluating its options — a situation that 
worries lawmakers. 

On 20 November, two members of Con- 
gress sent the NIH a letter asking the agency 
for its plan to rehome the remaining chimps. 
“We want to make sure that for the sake of
taxpayers and these much-abused chimpanzees,
these delays are overcome immediately,”
they wrote.

Although retired, the apes of Chimp Haven 
may one day re-enter research labs — posthu- 
mously. Sherwood’s team is drafting an agree- 
ment with the sanctuary to obtain the animals’ 
brains when they die; it also hopes to acquire 
organs from chimps in zoos and research 
facilities. “You can imagine 20 years from now, 
this ageing population won't be here,” he says. 
“If we weren't making the efforts today, there 
wouldn't be a way to study neurobiology in 
chimpanzees.”




ALL TOGETHER NOW 


After 25 years of negotiations, all countries are finally set to take steps 
to limit global warming. A special issue examines the path to the 
Paris climate summit, and the road beyond. 


ILLUSTRATION BY DAVID PARKINS 


When more than 190 nations gather in Paris on
30 November to broker an agreement to mitigate climate
change, it will be a turning point for the planet.
A successor to the 1997 Kyoto Protocol has been a long
time coming. A previous attempt to shape a global agreement
fell apart in 2009, at talks in Copenhagen. Now the world is
ready to try again, and for the first time, all countries are
poised to take action (see page 418). But the history here is
sobering: the quest to build a global climate treaty has hit
many obstacles over the past 25 years. Its dramatic story is
chronicled in a comic starting on page 427.

Although the United Nations aims to limit global warming
to 2 °C, a News Feature on page 436 reveals that this will be
much harder than many studies have indicated. Part of the
difficulty will be ensuring that any treaty leads to actions
with lasting global momentum, say climate-policy experts
David Victor and James Leape (see page 439). But Johan
Rockström, director of the Stockholm Resilience Centre,
argues on page 411 that Paris will be a success if it shows that
the world is serious about addressing the climate problem.

To explore the backstory to the talks, historian Adam Rome
reviews seminal books on sustainability from the 1960s and
1970s (see page 443). A News story explores the challenges
facing the Green Climate Fund, a UN mechanism to help
developing nations adapt to climate change (see page 419).

Online, Nature presents videos about the climate summit,
as well as other unique material, at www.nature.com/parisclimate.
We will also cover the talks as they happen.

Any agreement reached in Paris will not solve the climate
problem, but it could lay a solid foundation for collective action
(see page 409). The quest to save the planet will continue for many
more years.

PARIS CLIMATE TALKS
A Nature special issue
nature.com/parisclimate




Can nations unite to save 
Earth’s climate? 


A COMIC BY 
RICHARD MONASTERSKY 
AND NICK SOUSANIS 


WHEN THE WORLD’S NATIONS GATHER IN PARIS THIS
DECEMBER TO NEGOTIATE A CLIMATE TREATY, THEIR
EFFORTS WILL CAP A 25-YEAR-LONG JOURNEY
PLAGUED BY DETOURS AND DEAD ENDS.

THE QUEST STARTED IN 1990 WHEN
THE UNITED NATIONS LAUNCHED
TALKS AIMED AT PRODUCING THE
FIRST GLOBAL CLIMATE AGREEMENT.

NATIONS GATHERED FOR THE EARTH
SUMMIT IN RIO DE JANEIRO, BRAZIL.

IN RIO, THEY ADOPTED THE UNITED NATIONS
FRAMEWORK CONVENTION ON CLIMATE CHANGE,
WHICH DECLARED:

The ultimate objective of this Convention
... is to achieve ... stabilization of
greenhouse gas concentrations in the
atmosphere at a level that would prevent
dangerous anthropogenic interference
with the climate system.

THE RIO CONVENTION WAS A HISTORIC STEP,
BUT IT CONTAINED NO BINDING
COMMITMENTS TO SLOW GLOBAL WARMING.

[Chart axis: Carbon (billion tonnes)]


MAURICE STRONG, 
ORGANIZER OF 
THE RIO SUMMIT 


2 | ©NATURE 2015 


FOR SOURCES, CREDITS AND FURTHER READING, SEE: GO.NATURE.COM/EXUQOT 


WHAT IS THE GREENHOUSE EFFECT?
WATER VAPOUR, CARBON DIOXIDE AND
OTHER GASES IN THE ATMOSPHERE
KEEP THE PLANET WARMER THAN IT
WOULD OTHERWISE BE.

SOLAR RADIATION WARMS EARTH'S SURFACE,
WHICH RADIATES INFRARED ENERGY.

SOME OF THAT ENERGY
WARMS THE ATMOSPHERE.

GREENHOUSE GASES
ABSORB AND RE-EMIT
INFRARED RADIATION.

BY ADDING EXTRA CO2, METHANE
AND OTHER POLLUTANTS,
HUMANS ARE STRENGTHENING
THE GREENHOUSE EFFECT.

MARS: WEAK GREENHOUSE EFFECT, AVG TEMP -63 °C
EARTH: ENHANCED GREENHOUSE EFFECT, AVG TEMP 15 °C AND RISING
VENUS: EXTREME GREENHOUSE EFFECT, AVG TEMP 460 °C


IN 1896, THE SWEDISH SCIENTIST SVANTE ARRHENIUS
CALCULATED HOW CHANGES IN THE AMOUNT OF CO2
IN THE ATMOSPHERE COULD WARM OR COOL EARTH.

LONDON, EDINBURGH, AND DUBLIN
PHILOSOPHICAL MAGAZINE
AND
JOURNAL OF SCIENCE.
APRIL 1896.

On the Influence of Carbonic Acid in the Air upon
the Temperature of the Ground. By Prof. Svante Arrhenius

Average CO2 concentration: 295 parts per million (p.p.m.)

HE LATER SUGGESTED HUMANS WERE
RAISING THE PLANET’S TEMPERATURE
AND IT WOULD BECOME NOTICEABLE
IN A FEW CENTURIES.

THE CHANGES CAME
MUCH FASTER THAN
ARRHENIUS ANTICIPATED.

[Chart: Global temperature trend; difference in temperature (°C) relative to 1951-80 average]

ON 23 JUNE 1988, NASA
SCIENTIST JAMES
HANSEN TOLD A US
SENATE HEARING THAT
HUMANS WERE HAVING A
CLEAR IMPACT BY
BURNING FOSSIL FUELS
SUCH AS COAL, OIL AND
NATURAL GAS.

“THE GREENHOUSE
EFFECT HAS BEEN
DETECTED, AND IT
IS CHANGING OUR
CLIMATE NOW.”

IT WAS A WAKE-UP
CALL TO THE WORLD.

THE COUNTRY WAS
SUFFERING ONE OF ITS
WORST DROUGHTS EVER,

ACTIVIST CHICO MENDES
DREW ATTENTION TO THE
RAMPANT DESTRUCTION OF
THE AMAZON FOREST.

FOSSIL FUELS ARE NOT THE ONLY
CAUSE OF WARMING. DEFORESTATION
ALSO CONTRIBUTES BY RELEASING
THE CO2 STORED IN TREES.

Landsat satellite images show
forest loss in Rondônia, Brazil.


ALARMED BY THE GROWING
PROBLEM, THE UNITED NATIONS
CREATED THE INTERGOVERNMENTAL
PANEL ON CLIMATE CHANGE (IPCC)
IN 1988 TO ASSESS THE ISSUE.

AT THE IPCC’S FIRST MEETING, THE DIRECTOR OF
THE UNITED NATIONS ENVIRONMENT PROGRAMME,
MOSTAFA TOLBA, IMPLORED SCIENTISTS TO USE
THE TIME LEFT IN THE CENTURY - JUST 4,000
DAYS - TO DEAL WITH CLIMATE CHANGE.

HOPES WERE HIGH BECAUSE [...]
TAKEN STEPS TO SOLVE [...]

IN 1987, NATIONS ADOPTED A TREATY
TO PROTECT THE OZONE LAYER.

BUT REACHING THAT AGREEMENT
WAS RELATIVELY EASY BECAUSE
ONLY A HANDFUL OF COMPANIES IN
A FEW COUNTRIES PRODUCED
OZONE-DESTROYING COMPOUNDS.

IN THE CASE OF GLOBAL WARMING,
EVERYONE HAS A HAND IN THE PROBLEM
BECAUSE SO MANY ACTIVITIES GENERATE
GREENHOUSE GASES.

[...] IS INFINITELY MORE [...]

IN ITS FIRST REPORT, THE IPCC
FORECASTED THAT IF CURRENT
TRENDS CONTINUE UNTIL 2100,
THE WORLD WOULD BE 4°C
WARMER THAN IT WAS IN 1850.
SWELLING OCEANS WOULD BE
A MAJOR PROBLEM ...

... BECAUSE HALF OF HUMANITY
INHABITS COASTAL REGIONS.

[Chart: IPCC 1990 projected sea-level rise (centimetres)]

A MONSTROUS CYCLONE DROVE THAT POINT
HOME IN 1991 WHEN IT KILLED MORE THAN
140,000 PEOPLE IN BANGLADESH.

THE RIO TREATY WAS CLEARLY NOT
ENOUGH, SO NATIONS GATHERED
IN 1995 IN BERLIN TO NEGOTIATE
A STRONGER ACCORD.

BUT THE ASSEMBLED COUNTRIES
COULDN'T AGREE ON SPECIFICS.

[...] HAD TO ACT FIRST BECAUSE THEY
HAD CAUSED THE PROBLEM

[...] MOST OF THE PROBLEM - TO
CUT EMISSIONS BY 20%, [...]

DURING ALL-NIGHT NEGOTIATIONS,
GERMANY’S ENVIRONMENT MINISTER,
ANGELA MERKEL, BROKERED A
DEAL. COUNTRIES WOULD HAVE TWO
YEARS TO AGREE ON EMISSIONS
LIMITS FOR DEVELOPED NATIONS.


IN DECEMBER 1997, COUNTRIES GATHERED IN
KYOTO, JAPAN, TO HASH OUT A NEW TREATY. BUT
THEY COULDN’T AGREE ON HOW MUCH DEVELOPED
NATIONS SHOULD TRIM THEIR EMISSIONS.

[Panel labels, partly illegible: proposed cuts of 20%, 15% and 5%]

THE US WANTED DEVELOPING COUNTRIES TO ACT, TOO.

AFTER WORKING THROUGH THE
FINAL NIGHT, NEGOTIATORS
REACHED AN AGREEMENT
CALLED THE KYOTO PROTOCOL.
IT WAS THE FIRST TIME THAT
COUNTRIES PROMISED TO REIN
IN GREENHOUSE-GAS POLLUTION
BY SPECIFIC AMOUNTS.

THE KYOTO PROTOCOL
SPLIT THE WORLD IN
TWO: INDUSTRIALIZED
COUNTRIES WITH
EMISSIONS LIMITS ...

... AND DEVELOPING
COUNTRIES WITHOUT.

THE PROTOCOL ALSO ALLOWED FOR
FLEXIBILITY IN HOW COUNTRIES MET
THEIR COMMITMENTS. DEVELOPED
NATIONS COULD GET CREDIT FOR
REDUCING EMISSIONS IN POORER ONES.

DEVELOPED COUNTRIES PROMISED TO CUT THEIR OVERALL
EMISSIONS TO 5.2% BELOW 1990 LEVELS FOR THE
PERIOD 2008-12. EACH COUNTRY HAD ITS OWN TARGET.

Iceland +10%
Australia +8%
Norway +1%
Russian Federation 0%
Canada -6%
Japan -6%
US -7%
Cumulative -5.2%

THE US REFUSED TO RATIFY THE
PACT BECAUSE OF CONCERNS THAT
ITS ECONOMY WOULD SUFFER WHILE
DEVELOPING NATIONS INCREASED
THEIR POLLUTION WITHOUT LIMITS.

BUT THE CRACKS
IN THE TREATY
WERE CLEAR
FROM THE START.

IN 2001, US PRESIDENT GEORGE W. BUSH
REJECTED THE AGREEMENT, SAYING “THE
KYOTO PROTOCOL WAS FATALLY FLAWED
IN FUNDAMENTAL WAYS.”

SOON, WORLD EVENTS MADE CLEAR HOW
LIMITED THE PROTOCOL WAS. IN 2006,
CHINA PASSED THE US TO BECOME THE
WORLD’S LARGEST CARBON EMITTER.

CANADA FORMALLY WITHDREW
FROM KYOTO IN 2011.

[Chart: CO2 emissions, 1960-2010]

MEANWHILE, GLOBAL
TEMPERATURES [...]

[Newspaper headlines]
WASHINGTON - Earth was warmer in 1998 than in any other year on record, climate scientists reported Thursday.
2005 HOTTEST YEAR ON RECORD
2010 HOTTEST YEAR ON RECORD


THROUGHOUT THE CLIMATE
NEGOTIATIONS, SCIENTISTS HAVE
TRIED TO SHOW WHAT KIND OF
WORLD AWAITS FUTURE
GENERATIONS IF GLOBAL
WARMING CONTINUES.

SUCH FORECASTS COME FROM
COMPLEX CLIMATE SYSTEM
MODELS, WHICH DIVIDE THE
GLOBE INTO MILLIONS OF CELLS ...

... AND SIMULATE THE
ATMOSPHERE, OCEANS
AND BIOSPHERE IN 3D.

RESEARCHERS HAVE CONFIDENCE
IN THEIR MODELS BECAUSE THEY
CAN REPRODUCE FEATURES OF
PAST AND CURRENT CLIMATES.

[Chart: Temperature anomalies (°C), model versus observations]

[Map: Projected temperature change in 2100 for a mean global warming of 4°C]

ALTHOUGH SCIENTISTS AGREED HUMANS WERE WARMING
THE PLANET, SOME POLITICIANS DENIED THAT FACT.

“WITH ALL OF THE HYSTERIA, ALL OF THE FEAR,
ALL OF THE PHONY SCIENCE, COULD IT BE THAT
MAN-MADE GLOBAL WARMING IS THE GREATEST
HOAX EVER PERPETRATED ON THE AMERICAN [...]

“WE SHOULD INVESTIGATE THE WELL-FUNDED
EFFORT BY CERTAIN OIL COMPANIES TO [...]
ON THE REALITY OF GLOBAL WARMING.”
US REPRESENTATIVE HENRY WAXMAN, 2006

IN 2003, EUROPE
SUFFERED A
PROLONGED HEAT
WAVE THAT KILLED
AN ESTIMATED
70,000 PEOPLE.

[Chart: Excess deaths per day]

THE IMPACTS WERE GETTING
CLEARER. THE IPCC DECLARED IN
2007: “WARMING OF THE CLIMATE
SYSTEM IS UNEQUIVOCAL.”

LATER THAT YEAR, THE IPCC
WAS AWARDED THE NOBEL
PEACE PRIZE FOR ITS EFFORTS.

WHILE THE SCIENCE RACED
AHEAD, THE NEGOTIATIONS
DRAGGED ON. NATIONS MET
IN BALI IN DECEMBER 2007.

FOR THE FIRST TIME, DEVELOPING
NATIONS AGREED TO “MITIGATION
ACTIONS” OF THEIR OWN CHOOSING
TO LIMIT CLIMATE CHANGE.

THE TALKS WERE SO FRACTIOUS THAT
AT ONE POINT, THE CHAIRMAN BROKE [...]

[...] SET FOR A TREATY IN 2009 THAT
WOULD INCLUDE NEW COMMITMENTS
BY DEVELOPED COUNTRIES.

[...] ADDRESS DEFORESTATION.


IN THE RUN-UP TO THE 2009 [...]

TO LIMIT [...]
GOAL THAT WOULD AVERT
CATASTROPHIC CHANGE.

[...] 3 BILLION TONNES A YEAR.

[Chart labels: Carbon budget already used; 2014 CO2 concentration (p.p.m.)]

[...] EMISSIONS CAPS.

SOME DEMANDED
REDUCING CO2 LEVELS TO
350 P.P.M., WHICH WOULD
[...] FUTURE WARMING
AND LESSEN THE RISKS OF
DANGEROUS IMPACTS [...]
EXTREME SEA-LEVEL RISE
AND MEGA-DROUGHTS.

DESPITE THE FRENZY OF ATTENTION,
THE COPENHAGEN NEGOTIATIONS
FAILED TO DELIVER A TREATY.

WESTERN NATIONS BLAMED
CHINA FOR BLOCKING
SUBSTANTIVE EMISSIONS LIMITS.

DEVELOPING COUNTRIES
CHARGED THAT THEY HAD
BEEN LEFT OUT OF
CRUCIAL DISCUSSIONS.

[...] NATIONS MET [...]

[...] COUNTRIES ARE EXPECTED
TO CONTRIBUTE US$100
BILLION A YEAR TO THE
FUND BY 2020.

AND FOR THE FIRST TIME, ALL
COUNTRIES AGREED TO REDUCE
EMISSIONS [...] DIFFERENT [...]
CAPACITIES.


AS THE NEGOTIATIONS
CRAWLED ALONG,
THE WORLD HAS HURTLED
THROUGH CHANGES.

A HEAT WAVE IN RUSSIA [...]
ROUGHLY 55,000 PEOPLE [...]
WILDFIRES ACROSS THE COUNTRY.

IN OCTOBER [...], THE
GLOBAL POPULATION
TOPPED 7 BILLION PEOPLE.

IN 2012, 97% OF THE GREENLAND
ICE SHEET SHOWED SIGNS OF
MELTING - THE FIRST TIME [...]
AN EXTENSIVE AREA HAD THAWED.

[...] THE STRONGEST [...] ON RECORD [...] LANDFALL
WHEN IT SLAMMED INTO THE PHILIPPINES WITH
WINDS OF 315 KILOMETRES PER HOUR.

[...] CLIMATE TALKS IN 2013,
PHILIPPINE DELEGATE NADEREV
SAÑO BROKE DOWN [...]
COUNTRY’S DEVASTATION AND
THE LACK OF PROGRESS IN
NEGOTIATIONS.

“[...] OF VULNERABLE COUNTRIES.”

[Map label: Arctic sea ice, September 2012]

JAMES HANSEN, WHO HAD [...]
THE ALARM [...]

[...] THE FIRST TIME IN
[...] MILLIONS OF YEARS.

[...] PROTESTS OVER THE PROPOSED
KEYSTONE XL PIPELINE.

IN SEPTEMBER 2013, THE IPCC
REPORTED THAT “HUMAN INFLUENCE
ON THE CLIMATE SYSTEM IS CLEAR.”

2014: CHINA [...] THE
EUROPEAN UNION IN PER
CAPITA EMISSIONS OF CO2.

THE US AND CHINA BROKERED A
HISTORIC [...] IN [...] EMISSIONS
26-28% BELOW 2005 LEVELS BY [...]

[...] THAN [...] COUNTRIES HAVE
SUBMITTED THEIR PLEDGES.

WITH NEARLY ONE THIRD OF ITS POPULATION STILL
LACKING ELECTRICITY, INDIA SAYS IT CANNOT YET
CUT ITS CO2 EMISSIONS. PRIME MINISTER
NARENDRA MODI’S GOVERNMENT PLEDGED TO
SUBSTANTIALLY INCREASE ENERGY EFFICIENCY.

2014 SET THE RECORD AS THE
HOTTEST YEAR EVER [...]
GLOBAL LAND [...]
SURFACE TEMPERATURES.


THE CALLS FOR ACTION - AND THE
WARNING SIGNS - GROW STRONGER.

“CLIMATE CHANGE IS A
GLOBAL PROBLEM WITH
GRAVE IMPLICATIONS .... IT
REPRESENTS ONE OF THE
PRINCIPAL CHALLENGES
FACING HUMANITY IN OUR DAY.”

NATIONS WILL FACE A TEST
AGAIN WHEN THEY MEET IN
PARIS IN DECEMBER: CAN THEY
TAKE SIGNIFICANT STEPS TO
LIMIT CLIMATE CHANGE?

[Chart: Annual global temperature (land and ocean)]

“ALTHOUGH WE’RE MOVING
IN THE RIGHT DIRECTION, IT
IS CLEARLY NOT ENOUGH.”

ONCE AGAIN, GLOBAL
TEMPERATURES WILL
SET A RECORD. 2015 IS
ON PACE TO TOP 2014.

FIRES IN INDONESIA RAVAGED THE
COUNTRY AND PUMPED HALF A BILLION
TONNES OF CARBON INTO THE AIR,
MORE THAN JAPAN PRODUCES IN A YEAR.

THE PARIS PLEDGES WILL PROBABLY LIMIT
WARMING TO BELOW 3 °C. BUT MUCH
STRONGER ACTION IS NEEDED TO STAY
UNDER 2°C. COUNTRIES WILL PROBABLY
BLOW THROUGH THE TRILLION-TONNE
CARBON BUDGET BEFORE 2040.

CO2 CONTINUES TO
THICKEN THE SKIES.
CONCENTRATIONS MAY
NEVER AGAIN DROP
BELOW 400 P.P.M.

EVEN IF GLOBAL WARMING DOES
NOT PASS 2°C, THE WORLD MIGHT
STILL FACE CALAMITOUS IMPACTS,
LIKE PARTS OF THE ANTARCTIC
ICE SHEET SLIDING INTO THE
OCEAN WITHIN A FEW CENTURIES.

THE WORLD HAS COME A LONG WAY
SINCE KYOTO. THIS YEAR, ALL COUNTRIES
ARE SETTING THEIR OWN GOALS, WHICH
MAY MAKE THE TARGETS MORE REALISTIC.
AND NATIONS MAY AGREE TO STEP UP
THEIR COMMITMENTS PERIODICALLY.

THE FRAMEWORK TO FIX THE PLANET
IS COMING TOGETHER, BUT IT IS
FRAGILE AND FAR [...]. THE
JOB OF FINISHING THE TASK WILL
FALL TO FUTURE GENERATIONS.

THE FRAGILE 
FRAMEWORK 


ORIGINALLY PUBLISHED IN 
NATURE 527, 427-435 (2015) 



For more on the Paris climate talks, 
see: nature.com/parisclimate 


The 2 °C dream


Countries have pledged to limit global warming to 2 °C, and climate models
say that is still possible. But only with heroic, and unlikely, efforts.


BY JEFF TOLLEFSON


The year is 2100 and the world looks nothing like it did when global leaders
gathered for the historic climate summit in Paris at the end of 2015. Nearly
8.8 billion people now crowd the planet. Energy consumption has nearly doubled,
and economic production has increased more than sevenfold. Vast disparities in wealth
remain, but governments have achieved one crucial goal: limiting global warming to
2°C above pre-industrial temperatures.

The United Nations meeting in Paris proved to be a turning point. After forging a
climate treaty, governments immediately moved to halt tropical deforestation and to
expand forests around the globe. By 2020, plants and soils were stockpiling more than
17 billion tonnes of extra carbon dioxide each year, offsetting 50% of global CO2
emissions. Several million wind turbines were installed, and thousands of nuclear power
plants were built. The solar industry ballooned, overtaking coal as a source of energy in
the waning years of the twenty-first century.

But it took more than this. Governments had to drive emissions into negative
territory, essentially sucking greenhouse gases from the skies, by vastly increasing
the use of bioenergy, capturing the CO2 generated and then pumping it underground
on truly massive scales. These efforts pulled Earth back from the brink.
Atmospheric CO2 concentrations peaked in 2060, below the target of 450 parts
per million (p.p.m.) and continue to fall.


NIK SPENCER/NATURE 




That scenario for conquering global warming is one
possible, if optimistic, vision of the future. It was
developed by modellers at the Joint Global Change
Research Institute in College Park, Maryland, as part of a
broad effort by climate scientists to chart possible paths
for limiting global warming to 2°C, a target enshrined in the UN climate
convention that will produce the Paris treaty.

Climate modellers have developed dozens of rosy 2°C scenarios over 
several years, and these fed into the latest assessment by the Intergov- 
ernmental Panel on Climate Change (IPCC). The panel seeks to be 
policy-neutral and has never formally endorsed the 2-degree target, 
but its official message, delivered in April 2014, was clear: the goal is 
ambitious but achievable. 

This work has fuelled hope among policymakers and environ- 
mentalists, and it will provide a foundation for debate as governments 
negotiate a new climate agreement at the UN's 2015 
Paris Climate Conference starting on 30 Novem- 
ber. Despite broad agreement that the emissions- 
reduction commitments that countries have 
offered up so far are insufficient, policymakers 
continue to talk about bending the emissions curve 
downwards to remain on the path to 2 degrees that 
was laid out by the IPCC. 

But take a closer look, some scientists argue, 
and the 2°C scenarios that define that path seem 
so optimistic and detached from current political 
realities that they verge on the farcical. Although the caveats and uncer- 
tainties are all spelled out in the scientific literature, there is concern that 
the 2°C modelling effort has distorted the political debate by obscuring 
the scale of the challenge. In particular, some researchers have questioned 
the viability of large-scale bioenergy use with carbon capture and stor- 
age (CCS), on which many models now rely as a relatively cheap way to 
provide substantial negative emissions. The entire exercise has opened up 
a rift in the scientific community, with some people raising ethical questions
about whether scientists are bending to the will of politicians and
government funders who want to maintain 2 °C as a viable political target. 

“Nobody dares say it’s impossible,” says Oliver Geden, head of the Euro- 
pean Union Research Division at the German Institute for International 
and Security Affairs in Berlin. “Everybody is sort of underwriting 
the 2-degree cheque, but scientists have to think about the credibility 
of climate science.” 

Modellers are first to acknowledge the limits of their work, and say that 
the effort is designed to explore options, not predict the future. “We’ll tell
you how many nuclear power plants you need, or how much CCS, but we
can’t tell you whether society is going to be willing to do that or not,” says
Leon Clarke, a senior scientist and modeller at the Joint Global Change
Research Institute. “That’s a different question.”


ONE TRILLION TONNES 

The idea of limiting global warming to 2 °C dates back to 1975, when 
economist William Nordhaus of Yale University in New Haven, Con- 
necticut, proposed that more than 2 or 3 degrees of warming would push 
the planet outside the temperature range of the past several hundred thou- 
sand years. In 1996, the EU adopted that limit, and the Group of 8 (G8) 
nations signed on in 2009. The parties to the UN convention on climate 
change affirmed the target in 2009 at their Copenhagen summit, and then 
formally adopted it a year later in Cancun, Mexico. 

The move caught scientists off guard. Before 2009, most modellers 
had focused on scenarios in which atmospheric CO, concentrations 
stabilized around 550 p.p.m. — double the pre-industrial level — which 
would probably limit warming to a little less than 3 °C. But as political 
interest in the 2 °C target grew, a few started exploring the implica- 
tions. In April 2009, a team led by Myles Allen, a climate scientist at the 
University of Oxford, UK, published¹ a study concluding that humans
would have to limit their total cumulative carbon emissions to 1 trillion 
tonnes (more than half of which had already been dumped into the
atmosphere) to maintain a chance of limiting warming to 2°C. This
trillion-tonne carbon budget provided a scientific baseline for what was 
now a politically important target, and many modellers shifted gears. 

“There were very few scenarios with stringent targets such as 2°C, 
and then sponsors started demanding it,” says Massimo Tavoni, deputy 
coordinator of climate-change programmes at the Eni Enrico Mattei 
Foundation in Milan, Italy. 

The flurry of modelling efforts that followed split into two main camps: 
pay early or pay late (see ‘Two paths to 2 °C’). In the former, nations need
to slash greenhouse-gas emissions immediately; in the latter, they can
buy time for a slower phase-out by developing a massive infrastructure
to suck CO₂ out of the air.

“Models that have these negative emissions really do let you continue 
to party on now, because you have these options later,” says John Reilly,
co-director of the Joint Program on the Science and Policy of Global 
Change at the Massachusetts Institute of Technol- 
ogy (MIT) in Cambridge. 

In the pay-later approach, most models rely on 
a combination of bioenergy and CCS. The sys- 
tem starts with planting crops that are harvested 
and either processed to make biofuels or burnt 
to generate electricity, which provide
carbon-neutral power because the plants absorb CO₂ as
they grow. The CO₂ created when the plants are
processed is captured and pumped underground, 
and the process as a whole eats up more emis- 
sions than it creates. A consortium sponsored by the US Department of 
Energy has tested such a system at one facility that produces bioethanol 
fuel in Illinois, but neither bioenergy nor CCS has been demonstrated 
on anywhere near the scales imagined by the models. 
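
The accounting behind the claim that such a system "eats up more emissions than it creates" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the field names and every number below are hypothetical and are not taken from any of the models discussed here.

```python
from dataclasses import dataclass


@dataclass
class BeccsFlows:
    """Hypothetical yearly carbon flows for one bioenergy-with-CCS system, in tonnes of CO2."""

    absorbed_by_crops: float        # CO2 taken up from the air while the crops grow
    released_on_processing: float   # CO2 produced when the biomass is fermented or burnt
    capture_fraction: float         # share of that CO2 captured and pumped underground (0 to 1)
    supply_chain_emissions: float   # assumed emissions from farming, transport and processing

    def net_emissions(self) -> float:
        """Net CO2 added to the atmosphere; a negative value means net removal."""
        vented = self.released_on_processing * (1.0 - self.capture_fraction)
        return vented + self.supply_chain_emissions - self.absorbed_by_crops


# Illustrative numbers only: the same system with and without capture.
with_capture = BeccsFlows(100.0, 100.0, 0.9, 20.0)
without_capture = BeccsFlows(100.0, 100.0, 0.0, 20.0)

print(round(with_capture.net_emissions(), 6))     # about -70: net removal
print(round(without_capture.net_emissions(), 6))  # 20.0: net source
```

With these made-up inputs the system is carbon-negative only because most of the processing CO₂ is stored; without capture, the assumed supply-chain emissions make it a small net source.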

“It’s just simple arithmetic: the carbon budget is so small that you
need to go negative, or at least you need to offset some of your emissions
in order to get to zero,” says Tavoni. “We tried to be honest, and pretty
agnostic about whether these transformations are easily achievable.” 

On the basis of those models and other information, the IPCC 
estimates that climate mitigation would reduce the projected global 
consumption in 2100 by 3–11%, a relatively modest amount that
would allow the global economy to keep growing overall. But remove 
either bioenergy or CCS from the scenarios and the costs increase 
substantially. If mitigation is delayed or bioenergy and CCS are 
constrained, most models simply can’t limit warming to 2°C. 

The question is whether any of those models accurately reflect tech- 
nical and social challenges. MIT has a model that tends to project costs 
two or three times the average reported by the IPCC, in part because it 
tries to reflect difficulties in scaling up any technology, such as the avail- 
ability of skilled labour and natural resources in different regions. And 
then there are the technical hurdles. Capturing CO₂ from power plants
has proved more difficult and expensive than many had hoped. Just one 
commercial project is currently operating, at the Boundary Dam Power 
Station in Saskatchewan, Canada. 

Moreover, Reilly says, the number of models that actually completed 
2°C scenarios remains relatively small, and they probably project lower 
mitigation costs than those that are not able to generate these low- 
emissions scenarios. “It's a very self-selecting set of models.” 

Although the caveats are listed in the IPCC assessment, the report does 
not adequately highlight economic and technical challenges or modelling 
uncertainties, says David Victor, a political scientist at the University of 
California, San Diego, who participated in the IPCC assessment. Victor 
does not place all the blame on scientists glossing over the problems: 
when researchers drafted the assessment's chapter on emissions scenarios 
and costs, he says, they included clear statements about the difficulty of 
achieving the 2°C goal. But the governments, led by the EU and a bloc
of developing countries, pushed for a more optimistic assessment in
the final IPCC report. “We got a lot of pushback, and the text basically 
got mangled,” Victor says.

[Figure: Two paths to 2 °C. Modellers have explored various scenarios for limiting global warming to 2 °C. One (left, ‘Staying positive’) immediately slashes fossil-fuel use while ramping up renewable-energy use. Another strategy (right, ‘Negative campaign’) allows continued use of fossil fuels, but bioenergy supplies a growing share of energy. Carbon from the bioenergy industry is captured and stored, driving overall emissions below zero. Each panel plots emissions (billion tonnes CO₂ per year) and energy supply (exajoules per year, split into fossil, biomass, non-biomass renewables and nuclear) from 2000 to 2100. Annotations: global CO₂ emissions drop immediately; solar power provides two-thirds of all energy by 2100; expansion of forests is required if emissions are to drop soon; carbon capture and storage efforts lock away more than 30 billion tonnes of CO₂ annually.]

For all of the concerns and criticisms, however, modellers say that the
exercises have illuminated important research questions, such as how
much bioenergy and CCS will cost and what effects they will have on 
land use, food systems and water availability. 

One 2014 study² in Earth’s Future, for instance, found that it would be
difficult to grow enough bioenergy crops, even with second-generation 
cellulosic biofuels, which are made not only from a plant’s sugars but 
also from the carbon in its stem and woody materials. The effort would 
require significant boosts in crop yields and the use of 77% more nitro- 
gen fertilizer by 2100. The bioenergy would also need to be produced 
in centralized facilities that capture the bulk of the emissions. Unless 
everything goes right, scaling up to the level projected in many models 
would be difficult without significantly reducing food production or 
clearing large swathes of natural ecosystems for farmland. 

“If we need to ramp up such a large infrastructure, we need to inves- 
tigate what that implies,” says Sabine Fuss, an environmental scientist 
at the Mercator Research Institute on Global Commons and Climate 
Change in Berlin. 

Fuss led a commentary³ in Nature Climate Change in October 2014 calling
for a transdisciplinary research agenda on negative emissions. One of
the first outgrowths of that work, led by co-author Peter Smith, a biologist 
at the University of Aberdeen, UK, is an upcoming assessment of carbon- 
negative strategies and potential limitations. Strategies include bioenergy 
with CCS, as well as other ways of absorbing carbon, such as planting 
forests, using chemical scrubbers to capture CO₂ directly from the air and
crushing rocks to enhance geological weathering that consumes the gas. 

“The science behind these technologies is probably a bit behind the 
models,” Smith says. “This sort of provides a road map for where we 
need to go in the next two or three years.” 


RISK FACTOR 

Modellers are also digging into real-world complexities. Most models 
assume that participation in climate mitigation will be global, that coun- 
tries will put a common price on carbon, that technological solutions 
will be widely available and that this combination will drive investment 
towards relatively cheap mitigation options in developing nations. But 
the reality could be more complicated. A team at the Joint Global Change 
Research Institute worked with Victor and others to investigate the risks 
of making investments in developing countries due to political instability
and the relatively poor quality of many public institutions there. Their
model showed⁴ that investors would probably shun developing countries
and pour money into developed ones, driving up costs and making it 
harder to curb rapidly rising emissions in developing nations. 

“The models have taught us that with unrealistic assumptions any- 
thing is possible, and with realistic assumptions it will be very hard to cut 
emissions to meet goals like 2 degrees,” Victor says. “That’s an important 
result because it forces — or should force — some sobriety about what 
can be achieved.”

One message that modellers have delivered quite clearly is that with- 
out collective and aggressive action by all countries, costs invariably 
increase, and the chance of hitting the 2°C goal plummets. This is pre- 
cisely the situation heading into the Paris summit. Most countries, and 
all of the major greenhouse-gas emitters, have submitted pledges to 
reduce their emissions, but these vary widely in ambition. 

As it stands, the world is on a path to nearly 3°C of warming by the end 
of the century, and even that assumes substantial emissions reductions 
in the future. If nations do not go beyond their Paris pledges, the world 
could be on track to use up its 2°C carbon budget as early as 2032. If the 
models are correct, world leaders may have to either accept extra warming 
or plan for a Herculean negative-emissions campaign. In the event that 
they choose the latter — and succeed — the entire debate will change. 
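
The budget arithmetic in the paragraph above can be made concrete with a short Python sketch. It is a simplification under stated assumptions: the function name, its parameters and the example figures are hypothetical, the units are arbitrary (for instance, billion tonnes), and it is not meant to reproduce the 2032 estimate, which depends on inputs not given in this article.

```python
from typing import Optional


def year_budget_exhausted(
    budget: float,
    already_emitted: float,
    annual_emissions: float,
    start_year: int,
    annual_change: float = 0.0,
    horizon_years: int = 500,
) -> Optional[int]:
    """Return the first year by whose end cumulative emissions reach the budget.

    annual_change is the fractional change in yearly emissions
    (0.0 keeps them flat, -0.05 cuts them by 5% a year).
    Returns None if the budget is not reached within horizon_years.
    """
    cumulative = already_emitted
    rate = annual_emissions
    for year in range(start_year, start_year + horizon_years):
        cumulative += rate
        if cumulative >= budget:
            return year
        rate *= 1.0 + annual_change
    return None


# Hypothetical inputs: a budget of 1,000 units with 600 already emitted.
flat = year_budget_exhausted(1000.0, 600.0, 40.0, 2015)
halving = year_budget_exhausted(1000.0, 600.0, 40.0, 2015, annual_change=-0.5)
print(flat)     # 2024: ten years of flat emissions use up the remaining 400 units
print(halving)  # None: emissions that halve every year never reach the budget
```

The sketch shows why the timing is so sensitive to near-term emissions: the remaining budget is fixed, so the trajectory of yearly emissions alone decides when, or whether, it runs out.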

“It’s a completely different game,” says Nebojsa Nakicenovic, an 
economic modeller and deputy director-general of the International 
Institute for Applied Systems Analysis in Laxenburg, Austria. “If that is 
technically possible, then we could also go below 2 degrees.” 

Fast-forward to 2100 once more. The bioenergy industry is now one 
of the largest and most powerful on Earth. People are pulling roughly 
as much CO, out of the atmosphere as they were emitting at the time of 
the historic Paris conference. Humanity has asserted control over the 
atmosphere, and governments face a new and difficult question at the 
108th anniversary of the UN climate convention: how low should they 
set the global thermostat?


Jeff Tollefson writes for Nature from New York. 


1. Allen, M. R. et al. Nature 458, 1163-1166 (2009).
2. Kato, E. & Yamagata, Y. Earth’s Future 2, 421-439 (2014).
3. Fuss, S. et al. Nature Clim. Change 4, 850-853 (2014).
4. Iyer, G. C. et al. Nature Clim. Change 5, 436-440 (2015).




SOURCE: IIASA/IPCC 


COMMENT

Share data on corrosion to avoid infrastructure disasters p.441

Five books that raised the alarm about our finite planet p.443

Two books take a dim view, through twilight and fog p.445

Gene-editing meeting raises concerns about disability rights p.446

ILLUSTRATION BY DAVID PARKINS 




After the talks 


The real business of decarbonization begins after an agreement is signed at 
the Paris climate conference, argue David G. Victor and James P. Leape. 


After years of failure to craft global
agreement on climate change, the
upcoming United Nations Paris
Climate Conference is likely to turn a corner.
Diplomats have drafted a workable text
that will probably be adopted. Businesses 
and environmental groups are engaged in 
the process in unprecedented ways. 
Governments, development banks and 
foundations are raising funds to help the 
poorest countries to pay for cutting emissions
and prepare for a changing climate¹,
the main sticking point in 2009, when the
last big climate conference, in Copenhagen,
ended in disarray. The UN and the French
hosts have a sophisticated agenda to bring 
all these efforts together. Even religious leaders
have spoken mightily of the dangers of
unchecked climate change.

Good news from the Paris meetings will 
build confidence, a crucial ingredient for 
effective international cooperation. Govern- 
ments and firms will invest in a future with 
lower emissions if they think that others will 
do the same². Agreement will demonstrate
the viability of a new, flexible ‘bottom-up’ 
mode for climate diplomacy — based on 
national pledges that accommodate different 
preferences and capabilities. By contrast, the 
rigid targets and timetables of the Kyoto Pro- 
tocol appealed to few of the world’s emitters. 




Yet a dose of sobriety is also needed. 
Agreements are feasible now only because 
diplomats are postponing the thorniest prob- 
lems, such as how to hold nations account- 
able. Business engagement may prove 
ephemeral when the spotlight shifts. Good 
news about climate finance is possible now 
because the blend of public funding (which 
is hard to mobilize and spend effectively) and 
private money (which is abundant but
rarely focused on global goals) is vague.

Whether the Paris conference will succeed 
depends on what unfolds afterwards. Diplo- 
mats will have much to do until 2020, when 
the main accords take full effect. Civil soci- 
ety — notably business — must shift from 
making bold promises to cutting emissions. 
Governments and business must build
and invest in review and accountability
mechanisms to ensure that they are keep- 
ing their promises — an area in which non- 
governmental organizations (NGOs) have 
a crucial role. And scientists must pursue 
research that is directly relevant to policy- 
making, as well as assessing the underlying 
causes and impacts of climate change. 


ENGAGE BUSINESS 

Keeping business on board will be the most 
important challenge. It is easy for companies 
to make commitments when the world’s 
media and political leaders are watching. It 
is harder to implement changes when cut- 
throat competition makes it risky to invest in 
more expensive but less polluting technolo- 
gies and practices. 

The most striking example of business 
engagement is the pledges that many firms 
and governments are making to cut deforestation³.
In 2010, the Consumer Goods
Forum (comprising the largest retailers and 
consumer-products companies) announced 
that its members would eliminate deforesta- 
tion from their supply chains, notably for 
palm oil, soya, beef, timber and pulp. More 
than 300 companies have followed suit (see 
www.supply-change.org). Leading produc- 
ers and traders of palm oil in Indonesia — 
which accounts for half of the world’s supply 
— have promised to stop converting forest or 
peat lands⁴. Palm oil is a main culprit in the
fires that have spread a choking haze across 
the region since August, afflicting more than 
40 million people and often causing daily 
emissions of greenhouse gases that surpass 
those of the United States. 

It is far from assured that these pledges 
will result in lasting changes in the complex 
supply chains — from how the land is man- 
aged, to the produced oil and finally to con- 
sumer products. There are already signs of 
trouble. Most businesses pledge to become 
more sustainable following pressure from 
NGOs⁵. (One of us, J.P.L., led WWF International
for nine years, during which time the
organization was centrally involved in many 
such efforts.) Firms fear consumer backlash 
if their products are tied to environmental 
destruction⁶ (see go.nature.com/518yjm).
After the Paris meetings, chief executives 
will need to activate changes through the 
ranks of their organizations and suppliers; 
NGOs will need both to maintain the pres- 
sure for action and to work with companies 
to secure broader reforms in major produc- 
ing countries. 

Shifting whole industries into more 
sustainable modes of production requires 
collaboration between government, business 
and civil society. Economic incentives must 
be rewired so that no firm can gain an advan- 
tage by, for example, continuing to destroy 
forest. Solutions will vary by country and 
locality, but common threads include better
governance (laws, fiscal regimes, property
rights and public administration) and
investment in helping countries, communi- 
ties and small producers to make the transi- 
tion to sustainability. 

Brazil has shown what is possible. Between 
1995 and 2005, forest loss in the Brazilian 
Amazon averaged 19,500 square kilometres 
per year, roughly the area of Israel⁷. By
2013, that rate had been cut by 70%, even 
as beef and soya production continued 
to grow. A combination of measures was 
applied: corporate commitments coupled 
with strong laws, satellite surveillance and 
robust enforcement, restrictions on access 
to credit for farms and ranches in coun- 
ties with high deforestation, the creation of 
protected areas and indigenous reserves,
and improvements in land tenure and
governance. Brazil’s federal government
worked closely with the beef and soya
industries, NGOs and international
partners. In 2008, for
example, Norway committed US$1 billion 
to Brazil because it wanted to demonstrate 
practical new ways to protect forests globally. 
Even so, Brazil’s progress is fragile — defor- 
estation in the Amazon has increased over 
the past 18 months. 

Beyond forestry, industry’s commitment 
to reducing emissions is mixed. The three 
dozen firms and governments that account 
for 40% of the methane released from oil and 
gas production have pledged to eliminate 
those emissions by 2030, for example (see 
go.nature.com/beuw2z). Details on how this 
pledge will be monitored are scarce, as is a 
plan to extend the pledge to the rest of the 
global industry. 

Business is, mostly, still waiting to see 
whether the Paris conference will turn out 
to be a watershed. Governments are looking 
for signs that industry can cut emissions at 
acceptable cost and are sceptical that com- 
peting nations will take action. For all the 
good will in Paris, this chicken-or-egg prob- 
lem looms large — it explains why climate 
policy requires international cooperation, 
and why so little progress has been made 
over the past 25 years. Governments must 
grapple with huge unknowns about what 
mitigation will cost and whether other coun- 
tries will honour their commitments’. Until 
confidence in international cooperation 
grows, politicians and business leaders will 
talk big but deliver small’. 




NEW DIPLOMACY 

Optimism about Paris is partly rooted in a
new bottom-up bargaining system whose
flexibility, in theory, is suited to crafting policy
in areas in which cooperation is essential but
countries are unsure about what is feasible”®. 
National pledges — in diplomatic jargon, 
‘intended nationally determined contribu- 
tions’ (INDCs) — allow governments to 
align their commitments with national pri- 
orities. This approach has elicited firm com- 
mitments — notably from countries such as 
the United States, China and India, which are 
skittish about inflexible international legal 
commitments yet willing to do their part for 
the global whole. China's pledges, for exam- 
ple, will help to slow global warming while 
serving the country’s pressing concerns about 
reducing air pollution and achieving energy 
security. 

Pledge systems also bring dangers. The 
current round of INDCs is thin on content; 
some countries have failed to supply any 
reports, and industry has been largely absent 
from the process. Unless the pledging system 
is improved, it could become a licence to do 
nothing. This is why earlier schemes have 
yielded little practical action — as with the 
Asia-Pacific Partnership on Clean Develop- 
ment and Climate created in 2005 by then- 
US President George W. Bush after the United 
States refused to ratify the Kyoto Protocol. 
Pledges must offer enough detail and trans- 
parency for diplomats to link national efforts 
into more-ambitious, collective agreements 
in the future. A priority after Paris will be to 
develop stricter standards for national pledges 
as well as robust systems for review. 


ROAD AHEAD 

Only so much can be achieved within the 
UN system — in which consensus is usually 
required and it is easy for reluctant nations 
to block progress. Countries and firms 
will need to find ways to work in smaller, 
focused and more practical groups — in 
tandem with the broader global objectives’. 
Doing this cannot rest on altruism — it 
requires attention to self-interest and, as the 
palm-oil example shows, putting pressure 
on governments and firms to rethink their 
self-interest. 

Countries that want this new flexible sys- 
tem to work should volunteer to do more 
— for example, by offering their INDCs for 
reform and review. The United States and 
China should offer their own bilateral climate 
accord, made in November last year — which 
pledged emissions curbs and efforts to con- 
duct joint research on new technologies — to 
independent scrutiny, such as by the Organi- 
sation for Economic Cooperation and Devel- 
opment or the World Bank. With a huge stake 
in showing the effectiveness of the pledging 
process, these two countries must bear the 
burden of proof*. 

Firms, too, must recognize that their 
efforts will be believed only with transpar- 
ency and public accountability. Failure to 
demonstrate that corporate pledges are
leading to tangible action will lead to
demands after Paris for more onerous and 
costly regulation. Industry pledges should 
be reviewed alongside government com- 
mitments — and leading firms that have 
the most to gain from this new system of 
governance should invest in the needed 
independent reviews. NGOs have a key 
role in holding companies to account, 
assessing to what degree stated reductions 
are real (with no double counting) and 
identifying where extra effort is needed. 

For academics, this world of bottom-up 
diplomacy demands new skills. Periodic 
global assessments of the state of the sci- 
ence and gaps between what governments 
and firms pledge and what the planet needs 
for protection will still be needed. Equally 
urgent is interdisciplinary research pre- 
dicting how these messy, decentralized 
systems of governance will function. Scien- 
tists, including social scientists, will need to 
look, together, at how societies develop and 
implement policy reforms while assess- 
ing what works so that research is more 
informative for policymakers. 

Sceptics will see that messy reality on 
display at the Paris conference and declare 
that the event has failed to deliver on 
widely discussed goals such as stopping 
warming at 2°C above preindustrial levels. 
The better metric is whether Paris engages 
a growing share of industry and govern- 
ments in the climate task. When the meet- 
ings in Paris are done, the real business of 
decarbonization must begin.


David G. Victor is professor of 
international relations at the School of 
Global Policy and Strategy, University of 
California, San Diego, California, USA. 
James P. Leape is consulting professor
in the School of Earth, Energy and
Environmental Sciences & Woods Institute 
for the Environment, Stanford University, 
California, USA. D.G.V. and J.P.L. are 
also on the World Economic Forum's 
Global Agenda Council on Governance for 
Sustainability. 

e-mail: david.victor@ucsd.edu;
jleape@stanford.edu 


1. OECD. Climate Finance in 2013-14 and the USD 100 Billion Goal (OECD, 2015).
2. Victor, D. G. Global Warming Gridlock (Cambridge Univ. Press, 2011).
3. Hsu, A. et al. Nature Clim. Change 5, 501-503 (2015).
4. Carlson, K. M. & Curran, L. M. Carbon Mgmt 4, 347-349 (2014).
5. Potoski, M. & Prakash, A. (eds) Voluntary Programs: A Club Theory Perspective (MIT Press, 2009).
6. Overdevest, C. & Zeitlin, J. Regul. Gov. 8, 22-48 (2014).
7. Nepstad, D. et al. Science 344, 1118-1123 (2014).
8. Sabel, C. F. & Victor, D. G. Clim. Change http://dx.doi.org/10.1007/s10584-015-1507-y (2015).


Corrosion costs around US$4 trillion a year globally. 


Share corrosion data


To prevent disasters, Xiaogang Li and 
colleagues call for open data infrastructures to 
collate information on materials failures. 


In November 2013, an oil pipeline in
the Chinese city of Qingdao exploded,
killing 62 people and wounding 136.
Eight months later, a similar explosion in
Kaohsiung caused 32 deaths and 321 injuries.
The pipelines were made of steel of the
same specification and they failed after two
decades of use in similar environments. The
cause was corrosion: the degradation of
a material by a chemical or electrochemical
reaction with its environment.

Such disasters are common: each square
kilometre of any Chinese city hosts more
than 30 kilometres of buried pipes, creating
tangled networks of oil and gas lines, water
mains and electrical and telecommunications
cables. Corrosion is costly, too. According to a
US survey, corrosion costs six cents for every
dollar of gross domestic product in the United
States¹. Globally, that amounts to more than
US$4 trillion a year, equivalent to damages
from 40 Hurricane Katrinas. Half of that cost
is in corrosion prevention and control, the
other half in damages and lost productivity.


A lack of knowledge hinders our ability 
to prevent failures. Degradation of under- 
ground pipes, for example, is influenced 
by the compositions, microstructures and 
designs of materials, as well as by a raft of 
environmental conditions such as soil oxy- 
gen level, humidity, salinity, pH, temperature 
and biological organisms. 

Many industries, including oil, gas, 
marine and nuclear, collect corrosion data 
to identify risks, predict the service lives of 
components and control corrosion. Most of 
these data are proprietary, and best practices 
are rarely shared. Oil spills, bridge collapses 
and other disasters continue to occur. 

Demand for knowledge about corrosion is 
growing, with the increasing use of advanced 
materials in medical devices, biosensors, fuel 
cells, batteries, solar panels and microelec- 
tronics. Corrosion is the main restriction on 
many nanotechnology applications. 

Efforts to make materials data accessible,
such as the Materials Genome Initiative
(MGI), focus on ‘births’ rather than
‘deaths’ of materials. Online platforms
for sharing corrosion data are badly needed. 
Access to a large volume and variety of cor- 
rosion information that researchers could 
probe with data mining and modelling tools 
would improve forecasts of corrosion fail- 
ures and anticorrosion designs. 


COMPLEX PROCESSES 

The biggest challenge in corrosion research 
is predicting accurately how materials will 
degrade in a given environment². It requires
full knowledge of all relevant factors and 
their interactions. Yet precise models for 
mechanisms are lacking. Forecasting prob- 
lems is impossible without historical data 
about materials failures under various con- 
ditions. And field performances cannot be 
judged in laboratories when environmental 
parameters are unknown. 

Corrosion data are hard to collect. Damage 
may take years or decades to accumulate and 
any project tracks only a handful of contrib- 
uting factors. Data sets need to be combined. 
For example, early studies of marine corrosion
(occurring, for instance, on oil-drilling
platforms) were unreliable because they considered
only physicochemical processes (those
involving pH, dissolved oxygen and tempera- 
ture) and not the effects of organisms living in 
seawater. The inclusion of genomic data has 
now improved the models. 

Corrosion depends on local conditions. 
Steel structures that last for decades in dry 
parts of inland China fail within months in 
humid and salty coastal areas of southeast 
Asia. Protective polymer coatings that work 
for years at northern latitudes can degrade 
in weeks near the Equator, where heat and 
greater doses of ultraviolet radiation break 
chemical bonds more quickly. Inferring 
general corrosion knowledge — such as how 
particular steels are affected by humidity, salt 
or air pollution — requires combining stud- 
ies from many diverse environments. One 
worldwide survey of weathering steel, for 
example, reviewed exposure test results for 
up to 22 years from 108 sites in 22 countries³.

With global trade increasing, the oil and 
gas, construction, car, electronics and other 
industries have called for corrosion data to 
be shared between countries to ensure the 
quality and safety of their products. Millions 
of cars worldwide have been recalled in the 
past few years owing to unforeseen corrosion 
problems arising in destination countries. 
China’s 2013 ‘Belt and Road’ initiative, which
promotes industrial ties with countries along 
the Silk Road economic belt between China 
and the West, raises unprecedented chal- 
lenges. Rapid corrosion assessment, materials 
selection and design will be needed as billion- 
dollar construction, transport, energy and 
telecommunications projects begin in Asia, 
Africa and Europe. 

Advanced materials present entirely 


new corrosion problems. For example, the 
electrochemical stabilities of noble metals 
such as platinum and gold fall sharply as 
their dimensions decrease to nanometre 
scales. Corrosion of platinum nanoparticles 
remains a roadblock limiting the lifetime of 
platinum-based catalysts for fuel cells. 

Corrosion scientists have been slower 
than their materials-science peers to recog- 
nize the need for data sharing. Several large 
materials-data repositories built by US governmental
agencies under the auspices of the MGI house basic
physical, chemical and microstructure
data for materials, but not corrosion data. Yet
none of the advanced materials promised by
the MGI will be practical without considering
their environmental stability and durability. 


DATA REPOSITORIES 

Open data infrastructures should be set up 
to house corrosion data in various coun- 
tries, industries and applications. By using 
the same standardized formats for data and 
metadata, the data can be connected and 
eventually amount to a global system, pos- 
sibly linked to the MGI. 
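
What a shared, standardized record might look like can be sketched in Python. This is an illustration only, not an existing standard: the class names, field names, units and the `schema_version` tag are all assumptions, chosen to mirror the environmental factors listed earlier in this article (soil oxygen, humidity, salinity, pH and temperature).

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Environment:
    """Hypothetical description of an exposure environment; unknown values stay None."""

    medium: str                                   # "air", "soil" or "water"
    temperature_c: Optional[float] = None
    relative_humidity_pct: Optional[float] = None
    salinity_g_per_kg: Optional[float] = None
    ph: Optional[float] = None
    soil_oxygen_pct: Optional[float] = None


@dataclass
class CorrosionRecord:
    """Hypothetical record of one field exposure test."""

    material: str
    site: str
    country: str
    exposure_years: float
    mass_loss_g_per_m2: float
    environment: Environment
    schema_version: str = "0.1"                   # lets repositories evolve the format


def to_json(record: CorrosionRecord) -> str:
    """Serialize a record to JSON with stable key order for easy comparison."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> CorrosionRecord:
    """Rebuild a record from JSON produced by to_json."""
    data = json.loads(text)
    environment = Environment(**data.pop("environment"))
    return CorrosionRecord(environment=environment, **data)


# Made-up example values, for illustration only.
example = CorrosionRecord(
    material="carbon steel",
    site="test station A",
    country="China",
    exposure_years=2.0,
    mass_loss_g_per_m2=350.0,
    environment=Environment(medium="soil", temperature_c=14.5, ph=7.8),
)
assert from_json(to_json(example)) == example     # the format round-trips losslessly
```

Keeping the environment as explicit, unit-labelled metadata is the design point: records from different countries and industries can only be pooled for data mining if each states the conditions under which the material was exposed.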

Governments should take the lead. 
For example, the Chinese government 
has invested nearly 200 million yuan 
(US$30 million) since 2006 on a platform for 
sharing corrosion data from 30 field-testing 
stations covering standard materials in 
environments (air, soil and water) typical of 
different parts of the country. Other nations, 
industries and interest groups should estab- 
lish similar data infrastructures for corro- 
sion in other regions and sectors. 

Efforts need to be coordinated to collect 
corrosion data that are relevant to urgent 
or emerging challenges, such as alternative 
energy and nanotechnology. For instance, 
the US Department of Energy has partnered 
with the MGI to build materials data reposi- 
tories to help to speed up the development of 
alternative clean-energy sources. 

Funding agencies should incentivize the 
sharing of corrosion data about advanced 
materials and emerging technologies, 
for example, by demanding it in research 
grants and supporting the costs of publish- 
ing in open-access journals. Corrosion- 
science societies should learn from general 
materials-science societies (such as Materi- 
als Research Society, the Minerals, Metals & 
Materials Society and ASM International) 
and convene experts to establish data- 
sharing best practices and guidelines. 

Industry involvement can be encour- 
aged through partnerships with academia. 
Companies would save research and 
development costs in return for contributing
data to repositories. Because corrosion 
concerns maintenance and safety rather than 
industrial competition, businesses should 
be willing to share such data. Data consortia 
can be formed to identify common topics of 
priority and jointly develop benchmark solu- 
tions, just as industrial standards are agreed. 

More-powerful tools need to be developed 
for data capturing, management, mining, 
modelling and simulation — the integra- 
tion of which we term corrosion big data 
and informatics⁴. Advanced monitoring
technologies require ‘big data’ analytics. For 
instance, robots (known as ‘smart pigs’) car- 
rying hundreds of sensors deployed to inspect 
the walls of pipelines can collect 1 terabyte of 
data in one run. Highly accurate corrosion 
simulations could partially or completely 
replace the time-consuming, environmen- 
tally unfriendly, complicated and expensive 
experimental corrosion tests. For example, 
quantum chemical simulations are heavily 
used to evaluate the molecular structures and 
electronic properties of corrosion inhibitors⁵.

If corrosion data are shared, everyone will
benefit from the greater understanding that
results.


Xiaogang Li is professor at the Key Laboratory 
for Corrosion and Protection of the Ministry 
of Education, Institute of Advanced Materials 
& Technology, University of Science and 
Technology Beijing, Beijing, China, and is at 
the Ningbo Institute of Material Technology 
& Engineering, Chinese Academy of Sciences, 
Ningbo, Zhejiang, China. Dawei Zhang, 
Zhiyong Liu, Zhong Li, Cuiwei Du and 
Chaofang Dong are at the Key Laboratory 
for Corrosion and Protection of the Ministry 
of Education, Institute of Advanced Materials 
& Technology, University of Science and 
Technology Beijing, Beijing, China. 

e-mail: dzhang@ustb.edu.cn 


1. Koch, G., Brongers, M., Thompson, N.,
Virmani, Y. & Payer, J. Corrosion Cost and
Preventive Strategies in the United States (NACE
International, 2002).

2. Duquette, D. et al. Research Opportunities in 
Corrosion Science and Engineering (National 
Academies Press, 2011). 

3. Morcillo, M., Chico, B., Díaz, I., Cano, H. & de la
Fuente, D. Corrosion Science 77, 6–24 (2013).

4. Li, X. Informatics for Materials Corrosion and 
Protection: The Fundamentals and Applications of 
the Materials Genome Initiative in Corrosion and 
Protection (Chinese Chemical Industry Press, 
2014) (in Chinese). 

5. Taylor, C., Chandra, A., Vera, J. & Sridhar, N. 
Faraday Discussions 180, 459-477 (2015). 


CORRECTION 

The Comment article ‘Einstein was no 
lone genius’ (M. Janssen and J. Renn 
Nature 527, 298-300; 2015) wrongly 
stated the dates during which Albert 
Einstein studied at the Swiss Federal 
Polytechnical School in Zurich. He was 
there between 1896 and 1900. 


SUSTAINABILITY 


The first iconic image of Earth from space
sparked awareness of planetary boundaries.


The launch of 
Spaceship Earth 


Adam Rome revisits five prescient classics that first made 
sustainability a public issue in the 1960s and 1970s. 


In 1969, in a book-length essay entitled
Operating Manual for Spaceship Earth,
the inventor and polymath Buckminster
Fuller offered a striking metaphor for a new
ideal of planetary management. Although
Earth did not come with instructions, our
spaceship had built-in safety features that
had kept us going. Still, our pilot errors were
catching up with us: we had been so “misus-
ing, abusing, and polluting” the planet, Fuller
argued, that it might need to be renamed
“Poluto”. That way lay humanity’s oblivion.
But if we discovered how our spaceship
worked — if we learned to make the best
use of our incredible ingenuity — we might
become “comprehensively and sustainably
successful”.

Like everything Fuller wrote, Operating
Manual for Spaceship Earth was idiosyn-
cratic, at once arresting and fanciful. But
many of the book’s basic ideas were in the
air at the time. Between roughly 1965 and
1975, the challenge of sustaining civiliza-
tion inspired a shelf-full of influential books.
They had a freshness, urgency and breadth
that are hard to credit today, and they are still
remarkably relevant. Now that sustainability
as a concept has become dulled by overuse,
they return our eyes to the prize.

These seminal studies built on earlier 
fears. Fairfield Osborn’s Our Plundered 
Planet (Little, Brown) and William Vogt’s 
Road to Survival (W. Sloane Associates), 
both published in 1948, warned that uncon- 
trolled population growth and resource 
depletion would lead to calamity. But the 
situation seemed even more precarious 
by 1970, when the first Earth Day was cel- 
ebrated across the United States. The human 
impact on the planet had exploded after the 
Second World War, and scientific advances 
had led to greater understanding of the 
threat from those impacts. For the first time, 
many realized, we had the potential to dis- 
rupt or even destroy the planet’s life-support 
systems. The sense of environmental crisis 
was exacerbated by the social and political 
turmoil of the period. 

What would be required for humanity 
to continue to thrive? To tackle so huge a 
question required intellectual audacity, and 
the authors of the pioneering books on sus- 
tainability were all big-picture, interdisci- 
plinary thinkers par excellence. Economist 
Kenneth Boulding — author of The Meaning 
of the Twentieth Century (1964) — thought
historically and philosophically. Biologist
Barry Commoner felt compelled to study
political economy, as his 1971 The Closing
Circle shows. Fuller considered himself a
futurist. The authors of the 1972 The Limits
to Growth — Donella Meadows, Dennis
Meadows, Jørgen Randers and William
Behrens — meshed environmental science
with systems analysis. Barbara Ward was a
journalist, economist and adviser to world
leaders who collaborated with Pulitzer-prize-
winning microbiologist René Dubos on Only
One Earth (1972).

BOOKS & ARTS | COMMENT

The Meaning of the Twentieth Century: The Great Transition
KENNETH E. BOULDING
Harper and Row: 1964.

Operating Manual For Spaceship Earth
R. BUCKMINSTER FULLER
Southern Illinois University Press: 1969.

The Closing Circle: Nature, Man, and Technology
BARRY COMMONER
Knopf: 1971.

The Limits to Growth: A Report for the Club of Rome’s Project on the Predicament of Mankind
DONELLA H. MEADOWS, DENNIS L. MEADOWS, JØRGEN RANDERS, AND WILLIAM W. BEHRENS III
Universe: 1972.

Only One Earth: The Care and Maintenance of a Small Planet
BARBARA WARD AND RENÉ DUBOS
W. W. Norton: 1972.

The Meaning of the Twentieth Century 
is no longer well known, yet Boulding was 
key in framing the issue of sustainability. He 
made clear that the world that he hoped to 
sustain did not yet exist: humanity was in the 
middle of a “great transition” from an agri- 
cultural species to a thoroughly industrial 
one. In Boulding’s view, this transition was 
fraught with peril and sure to be wrenching. 
It might be derailed by nuclear war or uncon- 
trolled population growth, and might fail if 
we misused natural resources, especially fos- 
sil fuels. To succeed, we needed to create “a 
stable, closed-cycle, high-level technology” 
that would not pollute or require exhaust- 
ible materials. (He expanded on that in an 
often-reprinted 1966 essay, “The economics 
of the coming spaceship Earth”.) But developing
new technology was not the heart of
Boulding’s prescription. He argued that a
sustainable future would require countless
“social inventions”, from new aesthetics to
better methods of resolving disputes. “The
unfinished tasks of the great transition are
so enormous,” he concluded, “that there is
hardly anyone who cannot find a role to
play in the process.” That is as true as ever:
dealing with climate change requires a
host of skills.

Inventor Buckminster Fuller (top) approached sustainability as a design challenge; economist Barbara
Ward (bottom) prompted the United Nations to integrate social and environmental issues.

Fuller broke new ground by defining sus- 
tainability as a design challenge. Already 
famous for inventions such as the strong,
lightweight, geodesic dome, he wrote exuber- 
antly about the need for an “industrial retool- 
ing revolution”: to achieve lasting affluence, 
we must learn to do more with less. Like Boul- 
ding, Fuller argued that we needed to treat 
fossil fuels as a short-term expedient while
we worked out how to fashion a sustainable 
future. For a reader today, the insights of 
Fuller’s work are not enough to make up for 
the idiosyncrasies of his language and argu- 
ment. William McDonough and Michael 
Braungart’s Cradle to Cradle (North Point, 
2002) would be a much better introduction to 
sustainable design. But in 1969, Fuller’s work 
seemed thrilling, and his Operating Manual 
became a bible for people keen to invent eco- 
efficient ways of providing energy, building 
things and managing wastes. 

Commoner’s The Closing Circle laid the 
foundation for industrial ecology. Particularly 
in the postwar decades, Commoner argued, 
the industrialized world had come to rely on 
a host of “ecologically faulty” technologies, 
from nuclear power to chemical pesticides. 
The technologies of the future needed instead 
to accord with four basic principles, which 
he defined as laws of ecology: “Everything 
is connected to everything else”, “Everything
must go somewhere”, “Nature knows best”
and “There is no such thing as a free lunch”. 

For Commoner, however, the ultimate 
problem was economic and political, not 
technological. Discussing the economic 
meaning of ecology, he argued that the 
private-enterprise system had serious 
flaws. Businesses had powerful incentives 
to produce new products that did more 
environmental harm than the products 
they replaced. They did not need to account 
for “biological capital”, and they did not pay
the full costs of production, which included 
pollution. In the decades since The Closing 
Circle appeared, making capitalism greener 
has become a major concern of economists, 
business-school professors, entrepreneurs, 
corporate executives and activists, yet much 
of Commoner’s critique still holds. 

The Limits to Growth asked — heretically 
— whether humans could continue indefi- 
nitely to make ever greater demands on 
Earth. The authors used computer modelling 
to explore the interactions between popula- 
tion growth, resource demand, industriali- 
zation, food production and pollution. They 
did not forecast the future, although com- 
mentators ever since have debated whether 
their ‘predictions’ were right; instead, they 
extrapolated. If present trends continued, 
the authors wrote, humanity would hit the 
wall “sometime within the next hundred 
years”. They hoped that people would avert 
a breakdown, but stated repeatedly that they 
could not model the social, political and cul- 
tural factors that might alter trends. They 
did consider whether technology could be a 
magic bullet, and the results were shocking. 
Even when they allowed for technological
progress that greatly increased the availability
of resources and reduced the amount of pol- 
lution, the result was still collapse — just far- 
ther down the road. Innovation alone could 
not lead to a sustainable economy. We needed
a fundamental shift in values. 

The Limits to Growth was an international 
sensation, selling over 12 million copies in 
more than 30 languages. Meadows, Meadows 
and Randers updated the analysis in 1993 and 
again in 2004, and the question of limits still 
prompts vigorous debate. Johan Rockström
and Mattias Klum’s Big World, Small Planet
(Yale University Press, 2015) and Donald 
Worster’s Shrinking the Earth (Oxford Uni- 
versity Press, 2016) are just two of the many 
books now probing the problem of growth. 

Ward and Dubos’s Only One Earth, writ- 
ten to accompany the 1972 United Nations 
Conference on the Human Environment, 
added an international perspective to the 
sustainability discussion. Ward had travelled 
the globe as an expert on economic develop- 
ment. A preliminary draft of the book was 
circulated for comment to scientific, busi- 
ness and intellectual leaders from 58 coun- 
tries, and the result is worth reading just for 
the summary of their responses, which made 
clear that people around the world held very 
different views about environmental issues. 
A European respondent argued for a retreat 
from industrialization, for example, whereas 
an Asian statesman wrote that developing 
nations could not afford “dreams of land- 
scapes innocent of chimney stacks”. 

For Ward and Dubos, any effort to ensure 
the survival of humanity had to bridge the 
tremendous gap between developed and 
developing nations. Although they didn’t 
use the phrase ‘sustainable development’,
they offered a path-breaking analysis of the 
challenge of raising living standards for the 
poor without degrading the environment. 
At the same time, they called for the affluent 
to take off their blinkers. Well-to-do nations 
needed to acknowledge the damage that they 
were doing to the biosphere — and to accept 
that their fate was inseparable from the pros- 
pects of the rest of the world. Because many 
environmental threats were global, Ward and 
Dubos concluded, “planetary interdepend- 
ence” had to become a moral and political 
reality, not just “a hard and inescapable sci- 
entific fact”. The UN Paris Climate Change 
Conference starting this month will be a test 
of how close we are to meeting that aim. 

Read together, the books of this charged 
decade demonstrate that building a sus- 
tainable civilization is multidimensional. It 
sweeps everything in: science and technol- 
ogy, politics, economics, social relationships, 
ethics. We cannot advance in a straight line.
We need to approach the goal from many
directions, with flexibility and tenacity.


Adam Rome is a professor of history and 
English and the Unidel Helen Gouldner 
Chair for the Environment at the University 
of Delaware in Newark. His latest book is 
The Genius of Earth Day. 

e-mail: arome@udel.edu 


Books in brief 


The Secret of Our Success: How Culture Is Driving Human
Evolution, Domesticating Our Species, and Making Us Smarter
Joseph Henrich PRINCETON UNIVERSITY PRESS (2015)
The force propelling Homo sapiens down its unique evolutionary
pathway is “culture-gene coevolution”, avers anthropologist (and
aerospace engineer) Joseph Henrich. Over time, he posits, the need
to acquire “adaptive cultural information” expanded the human
brain, and societies’ “collective brains” in turn shaped human
culture. Integrating insights from cognitive psychology, experimental
economics, history and ethnography, this limber and lucid study
concludes that we face a major transition into a new type of animal.


The Last of the Light: About Twilight 

Peter Davidson REAKTION (2015) 

Cultural historian Peter Davidson enters the twilight zone, tracing 
the crepuscular in science, psychology, history and the arts. 
Considering the 60th parallel north, around which “long evenings 
and protracted sunsets stretch”, Davidson probes aspects of this 
transitional state, including visual perception during the stages of 
twilight (civil, nautical and astronomical); dusk as a metaphor for 
crisis in Charles Dickens’s Bleak House; the proliferation of gilt and 
mirrors in the murky pre-electric era; and the poet Gerard Manley 
Hopkins’ observations of anti-crepuscular rays, published in Nature. 


London Fog: The Biography 

Christine L. Corton BELKNAP (2015) 

London’s ‘pea-soupers’ — opaque, yellowish smogs — were an 
environmental catastrophe, a cloak for nefarious activities and an 
artistic inspiration. An odiferous wig of soot from coal fires, sulfur 
dioxide and mist settled regularly over the city from the 1840s to the 
1960s. In this richly nuanced history, scholar Christine Corton takes 
us from polymath Robert Hooke spotting a pall of smoke over London 
in 1676 through the killer fogs that felled zoo animals, spurred crime 
and caused traffic accidents, and that ultimately galvanized scientists 
and the government to craft the 1956 Clean Air Act. 


The Secrets of Sand: A Journey into the Amazing Microscopic 
World of Sand 

Gary Greenberg, Carol Kiely and Kate Clover VOYAGEUR (2015) 
Beachcombers take heed: the real treasure is stuck to your soles. 
Sand — as cell biologist Gary Greenberg, microscopist Carol Kiely 
and science curator Kate Clover show in this delightful coffee-table 
book — is dazzling, from star-shaped forams to egg-like ooids. To 
photograph these minuscule jewels rock-polished by wind and surf, 
Greenberg used 3D microscopes and smart lighting. A stunning 
extra are images of the lunar dust particles that Kiely studies, 
including glassy spherules from extinct fire-fountain volcanoes. 


The Best American Infographics 2015 

Gareth Cook and Maria Popova MARINER (2015) 

Another year, another superb volume in this infographics series 
edited by journalist Gareth Cook; cultural curator Maria Popova (of 
blog ‘Brain Pickings’) guest-introduces. ‘What Do Americans Speak?’ 
(Slate, 13 May 2014) offers an eye-popping map showing the third 
most commonly spoken language in each US state — in Michigan, 
that is Arabic — and Nature’s own ‘Born Here, Died There’ (Nature 
http://doi.org/8xg; 2014) explores dynamic patterns in cultural 
history through an elegant animation. Barbara Kiser 




Correspondence 


Gene editing: heed 
disability views 


CRISPR-Cas9 is a gene- 
editing tool of great potential, 
although not necessarily from 
a disability-rights perspective 
(see D. J. H. Mathews et al. 
Nature 527, 159-161; 2015). 
People with disabilities are, in 
my view, unlikely to be queuing 
up for genetic modification: 
their priority is to combat 
discrimination and prejudice. 

To ‘fix’ a genetic variation that 
causes a rare disease may seem 
an obvious act of beneficence. 
But such intervention assumes 
that there is robust consensus 
about the boundaries between 
normal variation and disability. 
Contrary to the prevailing 
assumption, most people with 
disabilities report a quality of life 
that is equivalent to that of non- 
disabled people (G. L. Albrecht 
and P. J. Devlieger Soc. Sci. Med. 
48, 977-988; 1999). 

The UK Nuffield Council 
on Bioethics is deliberating the 
ethical and social dimensions 
of CRISPR. International 
guidelines are urgently needed 
(Nature 526, 310-311; 2015), 
and the voices of people living 
with illness and impairment 
need to be heard. 
Tom Shakespeare University of 
East Anglia, Norwich, UK. 
tom.shakespeare@uea.ac.uk 


Gene editing: govern 
ability expectations 


From a disability-rights 
viewpoint, problems that have 
dogged the debate on human 
genetic modification (see 
go.nature.com/6wb45k) also 
pervade your curtain-raiser
to the US National Academies
of Sciences, Engineering and
Medicine conference (see
D. J. H. Mathews et al. Nature
527, 159-161; 2015). The
authors’ portrayal of the public
as a passive recipient of ‘wisdom’
from ‘experts’ goes against
healthy discourse on responsible
research and governance.

The disability-rights 
community has a history 
of disagreement with such 
experts (including authorities, 
scientists and clinicians) over 
their perception of people with 
disabilities. This is summarized 
as ‘ableism’, a view that disability 
is an abnormality instead of a
feature of human diversity. It 
can lead to flawed ‘solutions’ and 
disempower those affected (see 
G. Wolbring J. Crit. Anim. Stud. 
12, 118-141; 2014). 

“It is time to collectively make
decisions about the kind of
world we want to live in,” write
Mathews and colleagues. This 
discussion should include ability 
expectations and how they 
should be governed. 

Gregor Wolbring University of 
Calgary, Alberta, Canada. 
gwolbrin@ucalgary.ca 


Gene editing: survey 
invites opinions 


As the US National Academies 
of Sciences, Engineering and 
Medicine summit on the 
regulation of CRISPR-Cas9 
gene-editing tools gets under 
way, we invite readers to 
contribute their opinions about 
this technology and its use to a 
survey at go.nature.com/eyowaf. 
Public engagement in decisions 
about applications of science 
and technology that affect 
society is essential. The summit 
is, to a degree, modelled on 
the 1975 Asilomar Conference 
on the potential biohazards of 
recombinant DNA (see Nature 
http://doi.org/899; 2015). It must 
not make the same mistake of 
being held behind closed doors. 
As one survey contributor 
remarks, it may be impossible “to 
get this [CRISPR-Cas9] genie 
back into the bottle”. So when it 
comes to wishes for the genie, 
those of both scientists and the 
public must be considered. 
Silvia Camporesi, Lara Marks 
King’s College London, UK. 
silvia.1.camporesi@kcl.ac.uk




Climate change also 
creates expatriates 


I visited the island of Tuvalu in 
the Pacific Ocean three decades 
ago as the environmental 
assessor for an aid-funded 
engineering consultancy. 
Pollution of the freshwater lens 
and scavenging of protective 
shoreline coral rubble for 
construction were problems even 
then. As you note (see Nature 
526, 624-627; 2015), these may 
drive exodus sooner than rising 
sea levels. 

Nobody likes to be forced out 
of their home. But small oceanic 
nations hold a valuable asset: 
sovereignty. Tuvalu already 
profits from its own Internet 
domain (.tv), and sovereign 
nations have United Nations 
votes, which are effectively on 
the market. They can operate 
attractive tax regimes. They can 
declare marine reserves and 
sell rights to fisheries, seabed 
mining or reef tourism. All of 
these make money, and it does 
not have to be divided between 
many people. They can all be 
done even if nobody lives there 
in person. Citizens of such 
small island nations could thus 
become well-off expatriates, as 
well as refugees. 

Ralf Buckley Griffith University, 
Gold Coast, Australia. 
r.buckley@griffith.edu.au 


Crowdfunding not 
fit for clinical trials 


Crowdfunding can raise money 
quickly and with minimal 
bureaucracy. But it should not be 
considered as a way to finance 
clinical trials because of potential 
ethical implications. 

One problem is that funding 
recipients are not accountable to 
the public because crowdfunding 
is unregulated. Another is that 
there is no setting of research 
priorities, so crowdfunded 
clinical trials may not be the 
most important or widely 
applicable ones. And media
tactics could attract emotional
donations, for example by
generating false expectations of a
‘cure’. Moreover, an inconclusive
or negative outcome could erode 
public trust. 

By contrast, the mainstream 
funding process for clinical 
trials takes into account disease 
prevalence, morbidity and 
mortality, justice and utility. 
Crowdfunding for clinical trials 
should be similarly regulated to 
mitigate its potential risks. 
Phaik Yeong Cheah University 
of Oxford, UK. 
phaikyeong@tropmedres.ac 


Lessons from EPA on 
tracking pollutants 


In our opinion, China could 
learn from the success of the
US Environmental Protection
Agency (EPA) in providing 
open-access environmental 
information to the public. This 
would enhance the credibility of 
government decisions. 

The EPA’s Toxics Release
Inventory programme, in 
partnership with state agencies, 
collects data from enterprises 
that must report emissions. It 
subjects this information to 
quality-assurance reviews, trends 
analysis and error correction, 
as well as making it publicly 
available. This evaluation of the 
entire information-flow process 
increases transparency and 
accountability. 

Using a comparable holistic 
approach, China’s Ministry of 
Environmental Protection could 
develop a secure access point 
for ministries and agencies and 
a web portal for public access. 

A designated group might set 
quality standards and policies 
for handling such information, 
akin to the EPA’s Office of 
Environmental Information. 
Bo Zhang Information Center, 
Ministry of Environmental 
Protection, Beijing, China. 
Wayne S. Davis EPA, 
Washington DC, USA. 
zhangbo@mep.gov.cn 


NEWS & VIEWS 


For News & Views online, go to 
nature.com/newsandviews 


Acorn worms in a nutshell 


The genome sequences of two members of the hemichordate group of marine invertebrates bring the evolution of their 
relatives, including vertebrates, into sharper focus. SEE ARTICLE P.459 


CASEY W. DUNN 


By examining the similarities and
differences among the genomes of
living organisms, we can reconstruct
features of the genomes of long-dead ances- 
tors. Such reconstructions provide insight 
into patterns of genome diversity and how 
organisms evolved through the gain, loss and 
modification of genomic features. The greater 
the number of sequenced genomes from living 
organisms, and the broader their distribution 
across the tree of life, the better is our view of 
these ancestral genomes. However, although 
hundreds of animal genomes have been pub- 
lished in recent decades, the vast majority are 
from only two groups: vertebrates and arthropods.
Simakov and colleagues’ publication¹
in this issue (page 459) of genome sequences 
for two species from a group of invertebrates 
known as hemichordates takes the sampling 
of animal genomes an important step forward. 

Hemichordates are exclusively marine 
animals. The adults live on the ocean bottom, 
whereas the larvae are free-swimming. There 
are about 130 described species’, which are 
divided into 2 groups. The pterobranchs, of 
which there are around 20 species, are small 
animals (up to about 5 millimetres long) that 
form colonies of asexually produced clones 
attached to a central disk by fleshy tethers’. 
The animals live in a tube network that they 
secrete. Just as birds were found to be ‘living 
dinosaurs’ — a group that had been thought 
extinct — pterobranchs are living graptolites, 
animals that are abundant in the fossil record*. 
In contrast to pterobranchs, the other group of 
hemichordates, called enteropneusts, are soli- 
tary animals that range in length from less than 
a millimetre’ to more than 2 metres (Fig. 1). 
Known as acorn worms, enteropneust adults 
burrow in soft sediments. 

Simakov and colleagues present the genome 
sequences of two enteropneusts — Saccoglossus 
kowalevskii and Ptychodera flava. The authors 
used these sequences, together with additional 
DNA sequence data on pterobranchs and sev- 
eral other animals, to build a phylogenetic tree 
that finds pterobranchs and enteropneusts to 
be sister groups (Fig. 2). This finding is in 
agreement with another analysis’ in rejecting
the previously suggested placement of
pterobranchs within enteropneusts.

Figure 1 | Pharyngeal gill slits. Enteropneusts, better known as acorn worms, use internal gill slits in the
pharynx region of their trunk to move water through their mouth to obtain oxygen and, in some species,
for filter feeding. The gill slits connect to external gill pores. This specimen is several centimetres long.

As interesting as hemichordates are in 
their own right’, much of the motivation for 
taking a closer look at them comes from a 
desire to understand their relatives. This is 
because hemichordates fall within Deutero- 
stomia, the group of animals that also includes 
echinoderms (radially symmetrical organ- 
isms such as sea stars and sea urchins) and 
chordates. Chordates are of particular interest 
because they include humans and our verte- 
brate kin. Although many chordate genome 
sequences are available, there are few genome 
resources for other deuterostomes. A draft 
genome for a sea urchin is available’, but until 
now there were no published genomes for 
hemichordates. 

The most recent common ancestor of 
deuterostomes lived more than 500 mil- 
lion years ago, and there is great diversity in 
the anatomy of adults of this group. However, 
many features of deuterostome embryology, 
including the formation of the anus from 
the blastopore and the creation of coelomic 


448 | NATURE | VOL 527 | 26 NOVEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


cavities by pinching off from the gut, are highly 
evolutionarily conserved. The main finding of 
Simakov and colleagues’ study is that deuter- 
ostome genomes, like their embryology, show 
extensive conservation across great evolution- 
ary timescales. The hemichordate sequences 
share many features with other deuterostome 
genomes, including gene composition, exon- 
intron structure and small- and large-scale 
gene order. This means that many well-char- 
acterized features of chordate genomes are not 
chordate-specific, but arose earlier in animal 
evolution. 

One of the most conspicuous deuterostome- 
specific traits is the pharyngeal gill slits. These 
openings allow water to pass through the 
mouth without entering the digestive tract, 
and they are involved in feeding and respira- 
tion in these animals. Gill slits arose in the stem 
lineage that gave rise to deuterostomes, and are 
not found in non-deuterostome animals, nor 
in echinoderms, in which they were secondar- 
ily lost (Fig. 2). A detailed understanding of 
the evolutionary origin of this feature is key
to understanding deuterostome, and therefore
our own, biology. Simakov et al. examine
a conserved cluster of six genes that is found
only in deuterostomes, and that includes genes
known to be involved in patterning gill slits in
other deuterostome species. In keeping with
the other conserved genome features that they
identified, the authors find that these genes are
also expressed in the pharyngeal-gill structure
of hemichordates.

Figure 2 | Deuterostome relationships. The deuterostome group can be divided into three clades:
chordates (Cephalochordata, or lancelets; Craniata, which includes all vertebrates; and Urochordata, such
as sea squirts); echinoderms (sea stars, sea urchins and relatives); and hemichordates (pterobranchs and
enteropneusts). Simakov et al.¹ present the first hemichordate genome sequences, from two enteropneust
species. The authors’ analyses provide new detail on evolutionarily conserved genes that play a part in the
development of gill slits. These structures arose along the deuterostome stem, were lost in echinoderms
and are reduced in the adults of some chordates (including humans).

Simakov and colleagues recognize that there 
is no ‘typical’ representative of any animal 
group; they sequenced two full hemichordate 
genomes, and collected less-detailed sequence 
data from a variety of other species to put these 
genomes in a richer evolutionary context. 
However, the authors’ study still faces the same 
challenge as all genome investigations — we 


CIRCADIAN CLOCKS 


are far from understanding which evolutionary 
changes in genomes underlie which evolution- 
ary changes in traits, including development, 
anatomy and functional biology⁸.

There are several reasons for this. First, 
many evolutionary genome changes are 
neutral⁹; they have no impact on traits or
fitness. This means that we should not assume 
that any particular genome change affects 
any traits. Second, on any given phylogenetic 
branch there will be many changes in both 
traits and genomes, and there are many pos- 
sible functional implications for any particu- 
lar genome change. Third, genome function 
itself evolves, so the same genome features 
do not necessarily relate to the same traits in 
different species. 

Our current coarse perspective on genome 


A receptor for subtle 
temperature changes 


The protein IR25a is best known for its role as an odour receptor in flies, but an 
analysis reveals that it also acts to synchronize the circadian clock by sensing 
small temperature fluctuations. SEE LETTER P.516 


FRANCOIS ROUYER 
& ABHISHEK CHATTERJEE 


Our body’s circadian clocks sense
the environmental changes that 
occur over 24 hours, allowing us 
to adapt our physiology and behaviour to 
day-night cycles. Light and temperature have 
by far the greatest influence on the clock that 
drives rest-activity rhythms, but how the 
neurons of this clock synchronize to tempera- 
ture in the brain remains largely unknown. 


On page 516 of this issue, Chen et al.¹ identify
a receptor protein in mechanosensory organs 
in flies that acts as a specialized temperature 
sensor, synchronizing the circadian clock with 
low-amplitude temperature cycles. 

Small daily fluctuations of just 1-2 °C are 
enough to synchronize the fly brain’s circadian 
clock with temperature² (a process known as
temperature entrainment). Experiments 
using cultures of different fly body parts have 
revealed that most organs can entrain their 
clocks with temperature cycles. The exception 


NEWS & VIEWS | RESEARCH | 


evolution will improve as more genomes are 
sequenced, and as functional genomic tools are 
developed that can be applied to any organ- 
ism, not just those that can be grown in the 
laboratory. The conservation of so many fea- 
tures across deuterostome genomes, which is 
brought into sharp focus with Simakov and 
colleagues’ addition of hemichordate genome 
sequences, reinforces the fact that radical 
morphological changes are not necessarily 
related to radical changes in genomes. This 
fact will shape the search for which of the vari- 
able features of deuterostome genomes are 
responsible for the great diversity we see across 
the group.


Casey W. Dunn is in the Department of 
Ecology and Evolutionary Biology, 
Brown University, Providence, 

Rhode Island 02912, USA. 

e-mail: casey_dunn@brown.edu 


1. Simakov, O. et al. Nature 527, 459-465 (2015). 

2. Kaul-Strehlow, S. & Röttinger, E. in Evolutionary
Developmental Biology of Invertebrates 6: 
Deuterostomia (ed. Wanninger, A.) 59-89; http:// 
dx.doi.org/10.1007/978-3-7091-1856-6_2 
(Springer, 2015). 

3. Lester, S. M. Mar. Biol. 85, 263-268 (1985). 

4. Mitchell, C. E., Melchin, M. J., Cameron, C. B. & 
Maletz, J. Lethaia 46, 34-56 (2013). 

5. Worsaae, K., Sterrer, W., Kaul-Strehlow, S., 
Hay-Schmidt, A. & Giribet, G. PLoS ONE 7, e48529
(2012). 

6. Cannon, J. T. et al. Curr. Biol. 24, 2827-2832 (2014). 

7. Sea Urchin Genome Sequencing Consortium et al. 
Science 314, 941-952 (2006). 

8. Dunn, C. W. & Ryan, J. F. Curr. Opin. Genet. Dev. 35,
25-32 (2015). 

9. Lynch, M. The Origins of Genome Architecture 
(Sinauer, 2007). 


This article was published online on 18 November 2015. 


is the brain, which must thus rely on external
sensors³. Expression of the nocte gene is
required both for the normal development of
mechanosensory structures called chordotonal
organs and for temperature entrainment of the
brain clock³. This suggests that chordotonal
organs, which are present in the antennae and
body parts such as legs and wing hinges, are
the external sensors. Although the antennae
are the major temperature-sensing organ⁴,⁵,
they are not essential for entrainment, indi- 
cating that the body chordotonal organs can 
do the job. 

To analyse the role of the body chordotonal 
organs in temperature entrainment of the 
brain's rest-activity clock, Chen et al. looked 
for proteins that interact with the Nocte pro- 
tein. Among the putative Nocte-binding 
partners that they identified was the protein 
Ionotropic Receptor 25a (IR25a). IR25a is part 
of the IR family, members of which are found 
in sensory organs and are involved in detecting 
chemicals⁶. So far, IR25a has been best known
for its role as a component of a multimeric
odour receptor in antennae⁷. As expected for
a Nocte partner, the authors found that IR25a 


26 NOVEMBER 2015 | VOL 527 | NATURE | 449 


© 2015 Macmillan Publishers Limited. All rights reserved 




was present in the sensory neurons of chordotonal organs.

Chen and colleagues investigated whether flies lacking IR25a could
synchronize their rest-activity rhythms to temperature changes using
several different entrainment protocols. For example, they entrained
flies using regular 12-hour intervals of light and dark, and then kept
the entrained animals in the dark while applying a new regime of
large-amplitude temperature cycles (between 16 and 25 °C or 20 and
29 °C), 7 hours ahead of the old light-dark cycle. Both wild-type flies
and IR25a mutants quickly reset their clock, becoming most active at
the end of the warm phase, whereas nocte mutants did not. However,
when the new regime involved low-amplitude temperature fluctuations
(18-20 °C, 21-23 °C and 25-27 °C), the IR25a mutants failed to adapt
(Fig. 1). IR25a is thus required for synchronizing the circadian system
with low-amplitude temperature changes, apparently independently of
the temperature range. The authors also provide data to show that IR25a
acts in the body chordotonal organs, rather than the antennae.

[Figure 1, panels a-d: activity and temperature plotted against time for wild-type flies and flies with no IR25a, under light and dark conditions.]

Figure 1 | Telling time with temperature. Chen et al.¹ investigated
whether flies lacking the protein IR25a can respond to temperature
changes to reset the circadian clock in their brain that controls
rest-activity rhythms. a, The authors synchronized the flies’ clock to
12-hour light-dark cycles, in which activity peaked twice during light
hours. The flies were then kept in constant darkness and exposed to
different temperature cycles. b, At constant temperature, the insects’
activity changed, but was still synchronized to the original light-dark
cycle. c, When exposed to large temperature cycles (variations of 9 °C)
that fluctuated 7 hours ahead of the light-dark cycle, the clock was
reset in both types of fly, so that the activity peak occurred at the end
of the warm period. d, When exposed to low-amplitude fluctuations
(2 °C), flies lacking IR25a could not respond, demonstrating that IR25a
mediates the clock’s synchronization with small temperature changes.

If IR25a acts as a temperature sensor in sensory neurons upstream of
the rest-activity clock, its loss should prevent synchronization with
temperature not only of behaviour, but also of the oscillations of the
proteins that make up the clock itself. The rest-activity clock involves
protein oscillations in about 150 neurons, from around 6 subsets.
Chen et al. observed complex defects in clock-protein oscillations
during shallow temperature cycles in IR25a mutants, whereas
oscillations were normal during light-dark cycles.

Most clock-neuron subsets were affected, but the authors observed
intriguing differences when the temperature cycles were applied in
constant light or darkness. Notably, some of the proteins in some
groups, such as ventral lateral neurons (LNvs), stopped cycling in
darkness, whereas others, such as DN2 dorsal neurons, stopped in light.
How temperature information travels in the clock network remains to
be discovered, but the authors revealed a key role for DN groups:
blocking their neuronal output with tetanus toxin prevented
behavioural synchronization to shallow temperature cycles.
Interestingly, this finding sits well with previous observations that
DN2 neurons are involved in the temperature entrainment of the larval
clock⁸, and temperature preference rhythms in the adult⁹.

Finally, Chen et al. demonstrated the role of IR25a in neuronal
responses to small temperature changes. Whereas sensory neurons in the
legs of both wild-type and IR25a-mutant flies responded to movements of
the leg, only wild-type neurons were activated by temperature changes.
Moreover, when IR25a was misexpressed in a population of large LNv
clock neurons, their activity increased in response to small
temperature fluctuations. Thus, the presence of IR25a is sufficient to
induce neuronal responses to temperature, even in the absence of its
partners in the multimeric olfactory receptor.

Chen and colleagues’ study deftly demonstrates the temperature
sensitivity of the rest-activity clock, and suggests that the
chordotonal organs are key players in the clock’s temperature
entrainment. However, the inability of nocte mutants to respond to
large temperature cycles indicates that the chordotonal organs also
mediate the response of circadian rhythms to larger temperature
changes. Whether IR25a might play a part here is unclear, but the
process clearly involves other temperature sensors as well. One
candidate is the ion-channel protein Pyrexia, which is found in the
chordotonal organs and seems to be needed for entrainment in
low-temperature conditions¹⁰. Another channel, TrpA1, has been
implicated in the control of the fly clock by temperature¹¹. However, a
consensus is emerging for a role for TrpA1 in the temperature-dependent
regulation of afternoon siestas, rather than in entrainment¹²⁻¹⁴.

The neuronal circuits that pass temperature information from the
chordotonal organs to the brain also remain unknown. The complex
changes in protein oscillations caused by loss of IR25a point to a role
for several groups of neurons. But whether different temperatures or
amplitudes of change target specific subsets of clock neurons is an open
question. In addition, Chen and colleagues’ work suggests that the
clock network responds differently to temperature in constant light
or darkness, suggesting that light inputs strongly affect temperature
entrainment.

Although light can entrain the 
brain clock through both internal 
and external sensors, temperature 
entrainment seems to rely on external 
sensors. The benefit of preventing the 
brain clock from directly sensing tem- 
perature is unclear. In mammals, the 
rest-activity clock, which is located in 
a brain region called the suprachias- 
matic nuclei, also controls body-tem- 
perature rhythms and uses them to 
synchronize peripheral clocks¹⁵,¹⁶;
it therefore makes sense that these 
nuclei contain temperature-resistant neuronal 
networks. But flies do not regulate their body 
temperature, so why is their brain clock tem- 
perature-resistant? Perhaps this organization 
prevents any overly strong effects of tempera- 
ture in favour of light, whose daily oscillation 
might be more reliable. Perhaps it mediates 
a balance between light and temperature, 
allowing one entrainment circuit to influence 
the other. Deciphering how the two sensory 
modalities are integrated by the clock neuronal 
network will be an exciting challenge for the 
next few years.


Francois Rouyer and Abhishek Chatterjee 
are at the Institut des Neurosciences 
Paris-Saclay, Université Paris-Sud, CNRS, 
Université Paris-Saclay, 91190 Gif-sur-Yvette, 
France. 

e-mail: francois.rouyer@inaf.cnrs-gif.fr


1. Chen, C. et al. Nature 527, 516-520 (2015). 

2. Wheeler, D. A., Hamblen-Coyle, M. J., Dushay, M. S. 
& Hall, J. C. J. Biol. Rhythms 8, 67-94 (1993). 

3. Sehadova, H. et al. Neuron 64, 251-266 (2009). 


4. Frank, D. D., Jouandet, G. C., Kearney, P. J., 
Macpherson, L. J. & Gallio, M. Nature 519, 358-361 
(2015). 

5. Liu, W. W., Mazor, O. & Wilson, R. I. Nature 519,
353-357 (2015).

6. Benton, R., Vannice, K. S., Gomez-Diaz, C. &
Vosshall, L. B. Cell 136, 149-162 (2009).

7. Abuin, L. et al. Neuron 69, 44-60 (2011).

8. Picot, M., Klarsfeld, A., Chélot, E., Malpel, S. &
Rouyer, F. J. Neurosci. 29, 8312-8320 (2009).


IMAGING TECHNIQUES 


9. Kaneko, H. et al. Curr. Biol. 22, 1851-1857
(2012).

10. Wolfgang, W., Simoni, A., Gentile, C. & Stanewsky, R.
Proc. R. Soc. B 280, 20130959 (2013).

11. Lee, Y. & Montell, C. J. Neurosci. 33, 6716-6725
(2013).

12. Das, A., Holmes, T. C. & Sheeba, V. PLoS ONE 10,
e0134213 (2015).

13. Green, E. W. et al. Proc. Natl Acad. Sci. USA 112,
8702-8707 (2015).


Super-resolution 


ultrasound 


By infusing blood vessels with gas-filled microbubbles and using rapid 
ultrasound imaging to detect the bubbles, super-resolution imaging of 
an entire vessel system has been achieved in a rat brain. SEE LETTER P.499 


BEN COX & PAUL BEARD 


Ultrasound imaging is used in hospitals
throughout the world as a safe, non-
invasive and relatively inexpensive
way to visualize a patient’s internal tissues in 
real time. The quality of ultrasound images has 
been improving steadily since the 1970s, owing 
to advances in hardware and image-forming 
algorithms. But, like all wave-based imaging 
techniques, ultrasound faces a fundamental 
limit because of the way that waves spread out 
(diffract) as they travel — two objects are dis- 
tinguishable from one another only if they are 
more than half a wavelength apart. On page 499 
of this issue, Errico et al.¹ overcome this limit to
produce super-resolution images of the micro-
vasculature in the brain of a live rat.

If the resolution limit of ultrasound imag- 
ing is related to wavelength, why not just use 
sound with a shorter wavelength? Although 
this approach is useful to some extent, the 
absorption of ultrasound waves increases 


Ultrasound scanner 


strongly as the wavelength decreases; therefore, 
using shorter wavelengths limits the depth to 
which tissue can be imaged before the reflected 
waves are attenuated too much to be detected. 
As a result, the resolution limit in clinical 
ultrasound imaging is at best hundreds of 
micrometres. To take useful images at depth, 
it is therefore necessary to bypass the half- 
wavelength limit. 
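The trade-off can be made concrete with a short calculation. The sketch below assumes a sound speed of about 1,540 metres per second, a commonly quoted figure for soft tissue, and uses illustrative frequencies that are not taken from the article; it simply converts frequency into the half-wavelength resolution limit.

```python
# Assumed typical speed of sound in soft tissue (not stated in the article).
SPEED_OF_SOUND_M_PER_S = 1540.0

def half_wavelength_um(frequency_hz, speed=SPEED_OF_SOUND_M_PER_S):
    """Half the acoustic wavelength, in micrometres, at a given frequency."""
    wavelength_m = speed / frequency_hz
    return 0.5 * wavelength_m * 1e6

# Illustrative frequencies in MHz (hypothetical choices for this sketch).
for f_mhz in (3, 5, 15):
    limit = half_wavelength_um(f_mhz * 1e6)
    print(f"{f_mhz:>2} MHz -> half-wavelength of about {limit:.0f} um")
```

Under these assumptions, a few megahertz gives a limit of a few hundred micrometres, in line with the figure quoted above, and raising the frequency shrinks the limit at the cost of penetration depth.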

The same fundamental resolution limit is 
found in light microscopy. But the develop- 
ment of several super-resolution techniques, 
such as photoactivated localization micro- 
scopy (PALM), has enabled researchers to 
achieve nanoscale resolution — breakthroughs 
for which the 2014 Nobel Prize in Chemistry 
was awarded. 

PALM achieves super-resolution imaging 
in three steps. The first step is to image light- 
activated fluorescent molecules that act as tiny, 
randomly distributed pinpricks of light. The 
use of low light intensities and the fact that the 
molecules’ activation is inherently random 




14.Roessingh, S., Wolfgang, W. & Stanewsky, R. 
J. Biol. Rhythms http://dx.doi.org/ 
10.1177/0748730415605633 (2015). 

15.Brown, S. A., Zumbrunn, G., Fleury-Olela, F., Preitner, 
N. & Schibler, U. Curr. Biol. 12, 1574-1583 
(2002). 

16.Buhr, E. D., Yoo, S. H. & Takahashi, J. S. Science 330, 
379-385 (2010). 


This article was published online on 18 November 2015. 


ensures that only a sparse subset is turned on 
at any one time. Thus, these point-like light 
sources are separated by more than half a 
wavelength, so the image of each one (a blurred 
spot called the point spread function) does not 
overlap with that of its neighbours. 

The second step is to determine the exact 
position of each point-like source by find- 
ing the centre of the point spread function. 
This is possible for well-separated sources, 
because the shape of the point spread func- 
tion can be known in advance. The final step 
is to repeat the illumination and detection 
steps many times. A different set of separated 
point-like sources is detected each time, until 
a sufficient density of source points has been 
obtained. By marking the positions of all of 
these point sources on a single meta-image, 
a super-resolved picture can be built up. The 
spatial resolution in this image can exceed the 
diffraction limit, because it is determined by 
the accuracy with which the position of each 
source can be estimated. 
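The three steps can be illustrated with a toy calculation. This is a minimal sketch rather than the procedure used in PALM itself: it assumes a Gaussian blur as a stand-in for the point spread function and uses an intensity-weighted centroid as the position estimate, and every name and number in it is hypothetical.

```python
import math

def gaussian_spot(cx, cy, sigma, size):
    """Step 1: render one blurred spot (a Gaussian stand-in for the point spread function)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def centroid(image):
    """Step 2: intensity-weighted centre of a single, well-separated spot."""
    total = sum(sum(row) for row in image)
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total
    return cx, cy

# One sparse emitter at a sub-pixel position, blurred over several pixels.
true_pos = (10.3, 12.7)
frame = gaussian_spot(*true_pos, sigma=3.0, size=25)
est = centroid(frame)

# Step 3: repeat for a different emitter in each frame and collect the
# estimated positions in a single composite list (the "meta-image").
emitters = [(8.2, 9.5), (10.3, 12.7), (14.6, 11.1)]
composite = [centroid(gaussian_spot(x, y, 3.0, 25)) for x, y in emitters]
print(est, composite)
```

Although each spot is smeared over several pixels, its centre is recovered to a small fraction of a pixel, which is why the composite can be sharper than any single frame.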

Could a similar approach be used to achieve
super-resolution ultrasound imaging? The 
first challenge is to identify potential point 
sources (point scatterers in the case of ultra- 
sound). Because small blood vessels are poor at 
reflecting sound waves, they can be hard to see 
using ultrasound, and gas-filled microbubbles, 
which reflect sound well, have long been used 
as contrast agents to enhance vessel visibility. 
Microbubbles are strong scatterers of sound, 
and so are good candidates as point sources. 
But to be useful in this context, there must be 
some way to identify them separately in ultra- 
sound images. 


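[Figure 1 schematic: an ultrasound scanner images blood vessels; sequential images are compared to pinpoint microbubbles; the process is repeated ×75,000 to build a composite image.]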


Figure 1 | Vessels visualized. Errico et al.¹ obtained ultrasound images of rat blood vessels infused with gas-filled microbubbles, which reflect ultrasound
waves. By taking images rapidly (at 500 frames per second) and generating difference data to compare sequential images, they were able to pinpoint the locations
of the few, well-separated microbubbles that degraded between each image (only a small number of microbubbles are shown here for simplicity, and are not
to scale). By repeating this process over many frames, a composite image was built up that revealed the locations of many thousands of microbubbles, making
super-resolution images possible in around 150 seconds.




In 2013, researchers achieved super- 
resolution ultrasound imaging by using a 
sufficiently dilute solution of microbubbles to 
achieve the necessary separation². Earlier this
year, the same group used this approach to
obtain super-resolution images of the micro-
vasculature in a mouse ear to a depth of more
than one centimetre³. They also tracked the
microbubbles, to estimate blood-flow velocity. 
However, their system acquired images at the 
low rate of 25 frames per second, which meant 
that hour-long imaging times were needed to 
achieve super-resolution. 

Errico et al. used a different approach that 
dispensed with the need for a dilute micro- 
bubble solution. By using a high-frame-rate 
imaging system (500 frames per second), 
they were able to detect the waves scattered 
from individual microbubbles by comparing 
sequential images. Signals from bubbles that 
have disintegrated or moved significantly in 
the time between frames can be detected in the 
data, and — as long as these changes are sparse 
enough to be spatially separated from one 
another — the positions and velocities of these 


microbubbles can be accurately determined⁴,⁵
(Fig. 1). The authors compared 75,000 images 
taken over about 150 seconds to build up 
super-resolution images of the vasculature 
in the cortical region of rat brains, through 
both intact rat skulls and skulls that had been 
thinned to reduce the acoustic attenuation. 
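The idea of comparing sequential images can be sketched in a few lines. This is a simplified illustration, not the authors' processing pipeline: it assumes that each microbubble echo is a Gaussian blob and that exactly one bubble disintegrates between two frames, and all positions and sizes are hypothetical. Only the frame rate and frame count come from the text.

```python
import math

FRAME_RATE_HZ = 500   # from the text
N_FRAMES = 75_000     # from the text
acquisition_s = N_FRAMES / FRAME_RATE_HZ  # total acquisition time in seconds

def render(bubbles, size=32, sigma=2.5):
    """Sum of Gaussian blobs, a stand-in for the echoes of nearby microbubbles."""
    return [[sum(math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * sigma ** 2))
                 for bx, by in bubbles)
             for x in range(size)] for y in range(size)]

def difference(frame_a, frame_b):
    """Pixel-wise difference between two sequential frames."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

def centroid(image):
    """Intensity-weighted centre of an isolated signal."""
    total = sum(sum(row) for row in image)
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total
    return cx, cy

# Three overlapping bubbles: too dense to separate in a single frame.
before = render([(14.0, 15.0), (16.4, 15.2), (15.0, 17.0)])
# In the next frame, the bubble at (16.4, 15.2) has disintegrated.
after = render([(14.0, 15.0), (15.0, 17.0)])

# The difference data contain only the bubble that changed, so it can be localized.
located = centroid(difference(before, after))
print(acquisition_s, located)
```

Because the unchanged bubbles cancel in the difference, the one that vanished stands alone and can be pinpointed even though the undiluted solution is far too crowded to resolve directly.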

Could this technique be translated to a 
clinical setting? At the ultrasound wave- 
lengths used by Errico and colleagues, over- 
coming the attenuating effect of the thick 
human skull will present a considerable 
challenge. The authors point out that it might 
be possible to circumvent this problem by 
using longer wavelengths, which are less 
severely attenuated. Nonetheless, imaging of 
less-challenging targets that do not require 
the ultrasound waves to pass through thick 
bone should be readily achievable. One dis- 
advantage of the new approach compared with 
conventional ultrasound imaging is the need 
to administer a contrast agent; this requires an 
intravenous cannula and can increase clinical 
scanning time. 

Super-resolution ultrasound imaging of 


Transition loses 
its invasive edge 


Two studies provide evidence that epithelial tumour cells do not need to 
transition to a mesenchymal-cell state to form metastases, but that this process 
does contribute to drug resistance. SEE ARTICLE P.472 & LETTER P.525 


SHYAMALA MAHESWARAN 
& DANIEL A. HABER 


Cancer often becomes lethal only when
cells from the primary tumour dis-
seminate to another organ. The early
steps of this highly complex process, called
metastasis, have been thought¹ to rely on
non-motile epithelial tumour cells acquiring
characteristics of mesenchymal cells, which are
more migratory. This change is known as the
epithelial-to-mesenchymal transition (EMT).
The migrating cancer cells then undergo a
reverse mesenchymal-to-epithelial transition
when they seed a secondary tumour². Meta-
stases therefore display the same epithelial-cell
predominance as primary cancers, leaving no
evidence of their transient mesenchymal state.
Now, two papers in this issue (Fischer et al.³
and Zheng et al.⁴) present data that challenge
the role of EMT as a crucial effector of cancer
metastasis.
In terms of cancer-cell characteristics, the 
epithelial lineage is associated with increased 
proliferation, whereas the mesenchymal 


lineage is linked to enhanced avoidance of 
anoikis (a form of death that occurs when cells 
detach from their normal tissue matrix) and 
of drug-induced death. Current understand- 
ing of EMT-induced metastasis is derived 
from in vitro data and mouse models of 
human cancers. But the inherent plasticity of 
the process and the limited clinical evidence 
supporting the occurrence of EMT in tumour 
specimens⁵ have led to scepticism about
EMT being the predominant mechanism 
governing the early steps of metastasis. EMT 
is, however, emerging as one of numerous 
mechanisms conferring resistance to various 
cancer therapies⁶.

Using mouse models of mammary tumours, 
Fischer and co-workers (page 472) surveyed 
the fate of epithelial tumour cells transition- 
ing to a mesenchymal state, from the cells’ 
inception and dissemination through the 
bloodstream to their exit from blood vessels 
and metastatic growth. To do this, the authors 
monitored the expression of green fluorescent 
protein (GFP) as a proxy for the expression 
of the genes that encode fibroblast-specific 




microvasculature is an exciting prospect. The 
technique has the potential to substantially 
advance the study of normal blood-vessel 
function, as well as disease. Moreover, it might 
enable doctors to readily identify microvessel- 
related disorders, such as tumour-related 
vessel growth and microvascular abnormalities 
in deep abdominal organs such as the kidneys, 
and to assess cardiovascular disease.


Ben Cox and Paul Beard are in the 
Department of Medical Physics and 
Biomedical Engineering, University 
College London, London WC1E 6BT, UK.
e-mail: b.cox@ucl.ac.uk 


1. Errico, C. et al. Nature 527, 499-502 (2015). 

2. Viessmann, O. M., Eckersley, R. J., Christensen- 
Jeffries, K., Tang, M. X. & Dunsby, C. Phys. Med. Biol. 
58, 6447-6458 (2013). 

3. Christensen-Jeffries, K., Browning, R. J., Tang, M. X.,
Dunsby, C. & Eckersley, R. J. IEEE Trans. Med. Imag.
34, 433-440 (2015).

4. Couture, O., Besson, B., Montaldo, G., Fink, M.
& Tanter, M. Proc. IEEE Int. Ultrasonics Symp.
1285-1287 (2011).

5. Desailly, Y., Couture, O., Fink, M. & Tanter, M. Appl.
Phys. Lett. 103, 174107 (2013).


protein 1 (FSP1) or vimentin, which are 
triggered when epithelial tumour cells switch 
to a mesenchymal state. The green fluores- 
cence persists in the progeny of these cells well 
after they revert to an epithelial fate. 

In two mouse models, the authors found 
evidence for EMT in a minor fraction of cells in 
the primary tumour and in a subset of circulat- 
ing tumour cells. However, the vast majority of 
metastatic tumours were not derived from the 
mesenchymal-switched cells expressing GFP, 
but from disseminating epithelial cells (Fig. 1). 
The researchers further show that inhibiting 
expression of the genes ZEB1 and ZEB2, which 
locks tumour cells in the epithelial state, did 
not impair metastasis of the mouse mammary 
tumours to the lung. 

Zheng et al. (page 525) reached a similar 
conclusion using tissue-specific deletion of 
the EMT-inducing transcription factors Snail 
or Twist to assess the consequences of EMT 
in a mouse model of pancreatic cancer. They 
found that loss of either Snail or Twist in the 
pancreatic epithelium does not affect tumour 
formation or overall survival, but it does sup- 
press EMT in the primary tumour. Despite 
the lower frequency of cells expressing mes- 
enchymal marker proteins in these tumours 
compared with those in mice in which Snail 
and Twist are expressed at normal levels, there 
were a similar number of metastases in the 
liver, lungs and spleen of these mice. 

Although they question the role of EMT 
in metastatic dissemination, both research 
groups go on to conclude that EMT con- 
tributes to drug resistance. In Fischer and 
colleagues’ mammary-tumour model, treat- 
ment with the drug cyclophosphamide led 
to enhanced survival and proliferation of 


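[Figure 1 schematic labels: primary tumour; epithelial tumour cell; mesenchymal cell after EMT; blood vessel; distant organ; metastasis; epithelial cell after MET; chemotherapy.]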


Figure 1 | Metastatic potential. A small fraction of epithelial cells in a
solid tumour acquires mesenchymal-cell characteristics during tumour 
progression, through a process known as epithelial-to-mesenchymal 
transition (EMT). Both epithelial and mesenchymal cancer cells can invade 
the bloodstream and exit it at distant sites, where the mesenchymal cells 
undergo a reverse mesenchymal-to-epithelial (MET) transition. Contrary 


EMT-switched tumour cells that had reached 
the lung. Analysis of the genes being expressed 
in these cells correlated this resistance with 
increased expression of genes encoding drug- 
metabolizing enzymes and drug-transporter 
proteins. These findings were mirrored in 
Zheng and colleagues’ Snail- or Twist-deleted 
pancreatic cancers, in which tumour cells with 
epithelial characteristics expressed higher 
levels of nucleoside-transporter proteins than 
did mesenchymal cells, potentially render- 
ing the epithelial cells more sensitive to the 
chemotherapeutic drug gemcitabine. 

These findings challenge the prevailing 
hypothesis that EMT is a key element in the 
metastatic dissemination of epithelial cancers, 
and they point to a distinct role of this cell-fate 
transition in enhancing cancer-cell survival 
during drug treatment. How can these data be 
reconciled with compelling previous reports 
on the role of EMT in metastasis? The earlier 
work studied the effects of EMT induced by 
the growth factor TGF-β or by overexpression
of transcriptional regulators such as Snail or 
Twist. Such approaches might not capture the 
physiological process that occurs spontane- 
ously in cancer cells as accurately as do the 
methods used in the current studies. It is prob- 
able that the induction of EMT that occurs 
during the natural progression of a cancer 
may be more subtle than the full EMT switch 
that is induced by the expression of powerful 
regulators and that has been associated with 
high levels of metastasis. 

However, the limitations of the models used 
by Fischer et al. and Zheng et al. need to be 
considered before EMT-mediated tumour 
invasion can be dismissed outright. EMT is 
orchestrated by complex circuitry involving 
multiple signalling molecules and transcription 
factors. Tracing switched cells on the basis of 
expression of a single gene may therefore not 
fully capture these complicated features. Simi- 
larly, Snail and Twist function redundantly in 
many settings⁷,⁸, and inactivation of both (and
potentially of other transcriptional regulators) 
simultaneously, rather than individually, may 




be required to abrogate EMT. Furthermore, 
cancer is a highly variable disease, and its full 
complexity cannot be completely captured in 
mouse models that are driven by expression of 
a few cancer-initiating genes. Nonetheless, the 
conclusions reached by the two studies warrant 
a re-evaluation of the role of EMT in cancer 
progression. Alternative ways in which epithe- 
lial cells could enter the bloodstream without 
acquiring mesenchymal properties, such as 
collective epithelial-cell migration⁹ or tumour
fragmentation¹⁰, are worth investigating.

The postulated role of EMT in mediating 
cancer-cell survival is reinforced by the two 
latest studies. Indeed, EMT has been linked 
to drug susceptibility of cancer cells, as well as 
to their entrance into a non-proliferative state 
in which they have stem-cell-like properties¹¹.
Understanding the many cellular pathways 
that together determine these cell fates, and 
how these pathways are modulated, is likely to 
provide fertile ground for drug discovery and 
for new therapeutic strategies.




to previous opinion, Fischer et al.³ and Zheng et al.⁴ find that the majority of
metastatic tumours at secondary sites are initiated by epithelial cells from the 
primary tumour, and not by cells that have undergone EMT and subsequent 
MET. However, both research groups also show that such transitioned cells are 
more resistant to chemotherapeutic drugs than are untransitioned epithelial 
cells and emerge as the dominant metastatic population following treatment. 


Shyamala Maheswaran and Daniel A. Haber 
are at the Massachusetts General Hospital 
Cancer Center, Harvard Medical School, 
Boston, Massachusetts 02114, USA. 

e-mails: maheswaran@helix.mgh.harvard.edu; 
dhaber@mgh.harvard.edu 


1. Thiery, J. P., Acloque, H., Huang, R. Y. & Nieto, M. A.
Cell 139, 871-890 (2009).
2. Ye, X. & Weinberg, R. A. Trends Cell Biol. 25,
675-686 (2015).
3. Fischer, K. R. et al. Nature 527, 472-476 (2015).
4. Zheng, X. et al. Nature 527, 525-530 (2015).
5. Yu, M. et al. Science 339, 580-584 (2013).
6. Singh, A. & Settleman, J. Oncogene 29, 4741-4751
(2010).
7. Peinado, H., Olmeda, D. & Cano, A. Nature Rev.
Cancer 7, 415-428 (2007).
8. Puisieux, A., Brabletz, T. & Caramel, J. Nature Cell
Biol. 16, 488-494 (2014).
9. Clark, A. G. & Vignjevic, D. M. Curr. Opin. Cell Biol. 36,
13-22 (2015).
10. Aceto, N. et al. Cell 158, 1110-1122 (2014).
11. Rumman, M., Dhawan, J. & Kassem, M. Stem Cells
33, 2903-2912 (2015).


This article was published online on 11 November 2015. 


Hidden reservoirs 


West Africa’s Ebola epidemic continues to reveal surprises. Although the animal 
species that originally passed the virus to people remains a mystery, a virus 
reservoir and persistent disease have been identified in some human survivors. 


JONATHAN L. HEENEY 


Animals are reservoirs for many pathogens
that occasionally jump species
and infect humans. In December 2013
in the forests of Guinea, a two-year-old boy
became infected with the Zaire strain of Ebola
virus from an unidentified animal source¹.
This event triggered the largest and longest 
human epidemic of Ebola viral infection in 
recorded history. Across several countries in 


West Africa, over 28,000 people were infected 
and more than 11,000 died. This fatality rate of 
less than 50% was lower than in most previous 
outbreaks, and it left more than 16,000 survivors².
Studies of these survivors are changing
our understanding of Ebola virus infection and
raising concern for the long-term well-being
of these individuals and their communities.
Writing in the New England Journal of Medicine,
Deen et al.³ reveal that Ebola virus RNA
can persist in the semen of men for months


26 NOVEMBER 2015 | VOL 527 | NATURE | 453 


© 2015 Macmillan Publishers Limited. All rights reserved 




[Figure 1 labels: Reservoirs; Amplifying hosts; Human; Individual with symptomatic disease; Survivor with asymptomatic infection; Survivor cleared of virus; Human survivor population]


Figure 1 | Ebola infection dynamics in animals and humans. Ebola virus has been identified in several 
animal species, including bats, chimpanzees and forest antelopes. Transmission to humans can occur 
directly from reservoir species, in which the virus may persist without causing active infection, or from 
amplifying host species, in which the virus replicates to high levels, often causing illness and death. Most 
infected people develop acute Ebola virus disease and are highly infectious, although some individuals 
survive exposure and infection without developing symptoms. There is also growing evidence™ that the 
virus can persist in the central nervous system and reproductive organs of some survivors of the disease, 
with the possibility that these survivors could infect others months after resolution of their acute symptoms’. 


after their recovery from the disease, and Mate
et al.⁴ demonstrate that such persistence can be
the source of new infections through sexual 
transmission. 
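As a quick check of the outbreak figures quoted earlier (over 28,000 infections and more than 11,000 deaths), the arithmetic can be sketched as follows. The rounded counts are lower bounds taken from the text, and the variable names are illustrative assumptions rather than anything from the cited reports.

```python
# Rounded lower-bound counts quoted in the text (assumed here for illustration).
infected = 28_000
deaths = 11_000

# Case fatality rate: the share of infected people who died.
case_fatality_rate = deaths / infected

# People who were infected and did not die.
survivors = infected - deaths

print(f"Case fatality rate: {case_fatality_rate:.1%}")
print(f"Approximate number of survivors: {survivors:,}")
```

With these rounded inputs the fatality rate comes out below 50% and the survivor count above 16,000, in line with the statements in the text.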

Deen and colleagues obtained semen samples 
from 93 Sierra Leonean men who had survived
Ebola virus disease (EVD) at various intervals 
after the onset of their disease. Although the 
authors find that the proportion of men whose 
semen contained Ebola virus RNA waned with 
time, the viral genomes persisted for as long 
as 7-9 months after recovery. Mate and col- 
leagues provide convincing evidence that a 
female patient in Liberia, who subsequently 
died, had contracted Ebola virus through 
unprotected vaginal intercourse with her male 
partner, who had survived EVD. 

These observations support similar findings 
in previous epidemics of filoviruses, the virus 
family to which Ebola belongs. There have 
been reports⁵,⁶ of the persistence of Marburg
virus in the anterior chamber of the eye and
semen of human survivors, and of the persistence
of Ebola virus in the semen of men who
survived the 1995 outbreak in the Democratic
Republic of the Congo⁷. This has obvious
implications for sexual partners. 

The fact that Ebola virus is found at high levels 
in placental tissues also suggests that transmis- 
sion could occur from pregnant women who 
survive EVD to their babies, although pregnant 
women who become infected usually abort the 
fetus before term⁸. Mother-to-child transmission
by breastfeeding in survivors of Marburg
virus has been reported⁹, and the potential for
transmission through breast milk has also been
suggested for Ebola¹⁰.

Although the relative risk of virus 
transmission from survivors is low compared 


with transmission from patients with acute 
EVD, a single case of new infection is sufficient 
to trigger an epidemic (Fig. 1). Thus, there is 
a strong need for rigorous assessment of the 
tissue reservoirs of Ebola in human survi- 
vors and the associated public-health risks. 
Follow-up health care should be combined 
with compassionate education of survivors 
and their communities by qualified and 
knowledgeable personnel, including advice on 
condom use. 

Another lesson to emerge from this 
epidemic is that some survivors experience 
symptoms after their recovery from the main 
disease episode, suggesting that viral persis- 
tence in certain compartments of the body is 
more serious in some survivors than previ- 
ously recognized. Reported symptoms include 
blurred vision, pain behind the eyes, hearing 
deficits, painful swallowing, joint pain, fever, 
memory loss and difficulty in sleeping¹¹,¹².
The rehospitalization of a British nurse who
developed neurological complications more
than 9 months after surviving acute EVD¹³ is a
chilling indication that the virus can persist in 
the central nervous system and be triggered to 
reactivate or to escape immune surveillance, 
or both. Fortunately, diagnosis and success- 
ful clinical intervention were possible for the 
nurse in Britain, but this situation is unlikely 
in most communities in West Africa. 

The existence of a reservoir state in human 
Ebola survivors is now beyond debate. But 
we do not know how long viable virus can 
persist in these tissue reservoirs, nor whether 
the virus replicates there at low levels or is 
dormant and then triggered to replicate. Better 
definition and understanding of the reservoirs 
and the underlying mechanisms of post-EVD 




symptoms are needed to inform clinical 
management and treatment. 

For example, studies of survivors may 
identify features of their immune responses 
(such as neutralizing-antibody determinants) 
that correlate with either full viral clearance or 
the persistence of viral reservoirs. Such corre- 
lates may enable survivors to be classified into 
‘carrier’ or ‘cleared’ subtypes. Potential factors
that could predispose survivors to viral re- 
emergence also need to be taken into account, 
including genetics, compromised immunity 
owing to poor health, concurrent infections 
such as HIV, or use of immunosuppressive 
drugs. However, Ebola, like other RNA viruses, 
may be prone to mutational changes, and virus 
escape from the host’s immune response may 
eventually occur even without predisposing 
factors. 

It is also not clear how, or whether, post- 
EVD immunity is affected by the stage of 
treatment or type of therapy given, such as 
monoclonal antibodies or the antibodies in 
convalescent plasma. As well as helping to 
classify survivors, enhanced understanding of 
viral persistence will help to guide therapeu- 
tic choices — treatment with small antiviral 
molecules, for example, may facilitate full 
clearance of the virus. 

Although we are learning much about 
Ebola from this epidemic, we have yet to 
identify the events that caused the virus to 
jump to the Guinean boy almost two years 
ago. The consumption of bushmeat has been 
associated with previous epidemics, and 
some bushmeat species, such as great apes 
and forest antelopes, are susceptible to high 
levels of Ebola-virus replication and die from 
the infection. They are thus best considered 
as amplifying hosts, rather than the initial 
reservoir species (Fig. 1). Prime suspects for 
the reservoir include several species of bat, 
although a bat source has not been confirmed 
for this latest epidemic¹⁴. Indeed, the animal
reservoirs of Ebola may be cloaked by seques- 
tration of the virus in much the same way as 
its persistence in human survivors, waiting 
for physiological triggers for transmission to 
unexposed animals of the same species or to 
amplifying hosts. 

Understanding the triggers of Ebola emer- 
gence, the persistence of the virus in humans 
and the infection dynamics in its animal reser- 
voirs is vital not only for the long-term care of 
survivors of this epidemic, but also for preventing
the next one.


Jonathan L. Heeney is in the Laboratory of 
Viral Zoonotics, University of Cambridge, 
Cambridge CB3 0ES, UK.

e-mail: jlh66@cam.ac.uk 


1. Baize, S. et al. N. Engl. J. Med. 371, 1418-1425 (2014).
2. World Health Organization. http://apps.who.int/ebola/ebola-situation-reports
3. Deen, G. F. et al. N. Engl. J. Med. http://dx.doi.org/10.1056/NEJMoa1511410 (2015).
4. Mate, S. E. et al. N. Engl. J. Med. http://dx.doi.org/10.1056/NEJMoa1509773 (2015).
5. Martini, G. A. Trans. R. Soc. Trop. Med. Hyg. 63, 295-302 (1969).
6. Smith, D. H. et al. Lancet 319, 816-820 (1982).
7. Rodriguez, L. L. et al. J. Infect. Dis. 179 (Suppl. 1), S170-S176 (1999).
8. Baggi, F. M. et al. Eurosurveillance www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20983 (2014).
9. Borchert, M. et al. Trop. Med. Int. Health 7, 902-906 (2002).
10. Bausch, D. G. et al. J. Infect. Dis. 196 (Suppl. 2), S142-S147 (2007).
11. Gulland, A. Br. Med. J. 351, h4336 (2015).
12. Clark, D. V. et al. Lancet Infect. Dis. 15, 905-912 (2015).
13. Nursing Stand. 30 (8), 8 (2015).
14. Pigott, D. M. et al. eLife http://dx.doi.org/10.7554/eLife.04395 (2014).


PLANETARY SCIENCE


The Moon’s tilt for gold


The Moon’s current orbit is at odds with theories predicting that its early orbit 
was in Earth’s equatorial plane. Simulations now suggest that its orbit was tilted 
by gravitational interactions with a few large bodies. SEE LETTER P.492 


ROBIN CANUP 


Four and a half billion years ago, a giant
impact with Earth is thought to have
created an Earth-orbiting disk of debris
that coalesced to form the Moon. ‘Inelastic’
collisions between such debris would dissipate
energy and remove relative up-and-down
motions, so that the Moon that assembled from
these collisions would orbit approximately
in Earth’s equatorial plane. Yet the Moon’s
current orbit implies that its initial orbit
was substantially inclined relative to Earth’s
Equator¹, a troubling contradiction.

On page 492 of this issue, Pahlevan and
Morbidelli² identify a compelling and simple
solution to this problem: that the Moon’s
early orbit was gravitationally jostled into a
tilted state by close passes of large objects left
over from the formation of the inner planets.
The existence of a population of these
objects could also explain how elements such
as iridium, platinum and gold were delivered
to Earth’s outer layers after the Moon
formed³.

The Earth-Moon pair is a dynamically
coupled system. The Moon’s gravity raises tides
on Earth, most notably in the oceans, and gravitational
interactions between these tides and
the Moon are causing Earth’s rotation to slow
as the Moon’s orbit expands. Tidal interactions
also reduce the tilt of the Moon’s orbit
relative to a preferred plane. This plane would have
coincided with Earth’s equatorial plane when
the early Moon was orbiting close to Earth,
and transitioned to the plane of Earth’s orbit
around the Sun as the Moon’s orbit expanded.
In the absence of other effects, the current
5° inclination of the Moon’s orbit relative
to Earth’s orbital plane implies an initial 10°
inclination relative to Earth’s equatorial plane
when the Moon formed¹, 10 times larger than
expected according to theory⁴.

A seemingly unrelated (until now) set
of clues about the conditions soon after the
Moon formed emerges from the abundance of
precious metals in the Earth. Elements such
as platinum and gold are highly siderophile,
which means that they have strong chemical
affinities for iron. Because Earth formed in a
largely molten state, high-density iron would
have readily sunk to the planet’s centre to form
a core, taking highly siderophile elements with
it and efficiently removing these from Earth’s
upper layers. The fact that we find such ele- 
ments in relatively high abundance in rocks at 
Earth's surface suggests that they were deliv- 
ered to the planet after the end of core forma- 
tion, through a ‘late veneer’ of material that 
added about the last 1% of Earth’s mass”. 

If Earth’s late veneer was delivered by a large 
number of small impactors, the Moon would 
have received about 1/20th as many impactors 
on the basis of its smaller cross section’. But 
lunar siderophile abundances imply that the 
Moon received much less than that amount. 
It thus seems probable that Earth’s late veneer 
was delivered by only a few large impactors, 
each roughly comparable in size to the Moon, 
because the Moon would have received less 
than its proportionate share under these 
circumstances. 
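The 1/20th figure can be put in context with a rough estimate of how many small impactors each body would intercept. The sketch below assumes the standard gravitational-focusing expression for an effective cross section, πR²(1 + v_esc²/v_inf²), rounded textbook radii and escape speeds, and an approach speed `v_inf` that is a free parameter chosen purely for illustration. It ignores the fact that the Moon sits within Earth’s gravity well, and it is not the calculation used in the cited work.

```python
import math

# Rounded present-day values (assumptions for illustration only).
R_EARTH_KM = 6371.0
R_MOON_KM = 1737.4
V_ESC_EARTH_KM_S = 11.19
V_ESC_MOON_KM_S = 2.38


def focused_cross_section(radius_km, v_esc_km_s, v_inf_km_s):
    """Effective capture cross section with gravitational focusing (km^2)."""
    focusing = 1.0 + (v_esc_km_s / v_inf_km_s) ** 2
    return math.pi * radius_km ** 2 * focusing


def moon_to_earth_ratio(v_inf_km_s):
    """Ratio of the Moon's effective cross section to Earth's."""
    moon = focused_cross_section(R_MOON_KM, V_ESC_MOON_KM_S, v_inf_km_s)
    earth = focused_cross_section(R_EARTH_KM, V_ESC_EARTH_KM_S, v_inf_km_s)
    return moon / earth


# Purely geometric ratio, with no focusing at all.
geometric_ratio = (R_MOON_KM / R_EARTH_KM) ** 2

print(f"Geometric ratio: 1/{1 / geometric_ratio:.1f}")
for v_inf in (10.0, 15.0, 20.0):  # hypothetical approach speeds in km/s
    ratio = moon_to_earth_ratio(v_inf)
    print(f"v_inf = {v_inf:4.1f} km/s -> ratio 1/{1 / ratio:.1f}")
```

Under these assumptions, focusing always lowers the Moon’s share relative to the purely geometric value, because Earth’s stronger gravity enlarges its effective target area more than the Moon’s does.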

Pahlevan and Morbidelli use computa- 
tional methods (Monte Carlo simulations) 
to consider the effects of such a population 
of large, late-accreting background objects 
on the Moon’s early orbit. Their simulations
begin with a Moon orbiting in Earth’s equa- 
torial plane close to our planet (Fig. 1). With 
time, the Moon's orbit expands because of tidal 
interaction with Earth, and is gravitationally 
perturbed by the background objects until this 
population is depleted over typically a few tens 
of millions of years. 

Central to the new work is the recogni- 
tion that each object that ultimately collides 
with Earth first undergoes many thousands 
of non-collisional close passes, a portion of 
which strongly perturb the Moon's orbit. An 
object approaching the Moon from a random 




50 Years Ago 


The use of rubber gloves during
surgical operations became
general about 1900 ... The object
of an investigation was to obtain
an estimation of how frequently
wound infection originates from
bacteria on the hands of operating
staff ... Examination of the wounds
following 433 ‘clean’ operations,
of the 3,125 rubber gloves used
in those operations and of the
bacterial flora of the hands which
had worn 692 damaged gloves,
revealed no connexion between the
glove damage, the bacterial flora
and the wound infections observed.
From Nature 27 November 1965 


100 Years Ago 


The Times of November 20
published a rather flamboyant little
article, headed “A Surgical Schism?”
This article said: “Not for half a
century at least has the medical
world been so sharply divided as it
is to-day in regard to the question
of the treatment of wounds.” Now,
it is exactly half a century since
Lister ... first ventured to treat a
compound fracture by plugging the
wound with a strip of rag soaked
in undiluted and impure German
creasote. Pyaemia and septicaemia
and erysipelas were ravaging the
wards of the old Glasgow Infirmary,
and he, relying on Pasteur’s work
on the “germs of putrefaction,”
and knowing that creasote was
a good “disinfectant,” plugged
a wound with it. That was the
beginning of everything, exactly
half a century ago. To-day, there
are many methods, but they do
not all contradict or exclude each
other ... We must not imagine a
sort of desperate squabble among
our military surgeons ... The
suggestion in the Times article that
an acute controversy is proceeding
upon these matters is unfortunate
and misleading.

From Nature 25 November 1915 




Figure 1 | Collisionless interactions could have altered the early Moon’s orbit. a, When the Moon first
formed, its orbit was approximately in the plane of Earth’s Equator. Over time, its orbit then expanded.
b, Pahlevan and Morbidelli² propose that collisionless interactions with large objects passing through
the early Solar System would have strongly perturbed the Moon’s orbit. c, The cumulative effect of such
interactions would have tilted the Moon’s orbital plane sufficiently to explain the current inclination of the
Moon to Earth’s orbital plane around the Sun. The Moon’s orbital radius in b and c is not shown to scale.


direction may increase or decrease the Moon's 
orbital tilt. But just as a series of steps, each 
equally likely to be forward or backward, 
causes the standard deviation in the net 
distance travelled to increase with time, so 
too does a series of randomly oriented kicks 
to the lunar orbit lead to a general increase 
with time in the probability of exciting a 
minimum tilt. 
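The random-walk analogy in the paragraph above can be illustrated numerically. The following is a minimal one-dimensional sketch, not the authors’ Monte Carlo model: each step is +1 or -1 with equal probability, and the function name, trial count and seed are assumptions made for illustration. The root-mean-square net displacement should grow roughly as the square root of the number of steps.

```python
import math
import random


def rms_net_displacement(n_steps, n_trials=2000, seed=0):
    """Estimate the RMS net displacement after n_steps unbiased unit steps."""
    rng = random.Random(seed)
    total_squared = 0.0
    for _ in range(n_trials):
        position = 0
        for _ in range(n_steps):
            # Each kick is equally likely to be forward or backward.
            position += rng.choice((-1, 1))
        total_squared += position * position
    return math.sqrt(total_squared / n_trials)


for n in (10, 100, 400):
    simulated = rms_net_displacement(n)
    print(f"{n:4d} steps: simulated RMS {simulated:6.2f}, sqrt(n) {math.sqrt(n):6.2f}")
```

Although the average displacement stays near zero, the spread widens with the number of steps, which is why many randomly oriented kicks make a sizeable net tilt increasingly likely.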

Pahlevan and Morbidelli’s results show a 
high likelihood that such random scattering 
events can cumulatively produce the neces- 
sary early tilt in the Moon’s orbit, as long as the 
number of objects that deliver the final approx- 
imately 1% of Earth’s mass is small (fewer 
than 5) and the rate of early tidal expansion 
of the Moon's orbit is sufficiently rapid. The 
rate of early tidal expansion needed is broadly 
consistent with the average tidal properties 
inferred for Earth on the basis of the expan- 
sion of the Moon's orbit to its current orbital 
distance. However, the specific values that 
would have applied to the earliest Earth remain 
uncertain. 

The magnitude of the excited tilt scales 
roughly linearly with the late mass deliv- 
ered to Earth. It is not known what fraction 
of the siderophiles that were concentrated in 
the cores of such large impactors would have 
been retained in Earth's upper layers. Improved 
models of late-veneer impacts should therefore 


be used to better constrain the late-accreted 
mass; this would in turn allow a closer approxi- 
mation of the inclination expected from scat- 
tering. Moreover, the new scattering model 
is most effective if the Moon’s inclination has
been damped only by tides. If other forms 


of inclination damping have occurred, then 
— depending on the timing of this damp- 
ing — the required initial inclination might 
increase, and with it the required mass of 
background objects, perhaps to unrealistically 
high values. 

Previously reported models for the origin 
of the Moon’s inclination rely on more- 
complex processes involving either a periodic 
gravitational interaction (gravitational reso- 
nance) with the Sun’ or a resonant interaction 
between the Moon and its precursor disk’. 
Both require rather narrow sets of conditions 
for success. The new mechanism is simpler 
than these models, and the population of late 
lunar-sized objects that it requires is compel- 
lingly consistent with that needed to account 
for the delivery of Earth’s precious metals, a 
completely independent constraint. Had such 
a population of objects not existed, the Moon 
might be orbiting in Earth’s orbital plane, with 
total solar eclipses occurring as a spectacular 
monthly event. But our jewellery would be 
much less impressive — made from tin and 
copper, rather than from platinum and gold.


Robin Canup is in the Planetary Science 
Directorate, Southwest Research Institute, 
Boulder, Colorado 80302, USA. 
e-mail: robin@boulder.swri.edu 


1. Goldreich, P. Rev. Geophys. 4, 411-439 (1966). 

2. Pahlevan, K. & Morbidelli, A. Nature 527, 492-494 
(2015). 

3. Bottke, W. F., Walker, R. J., Day, J. M. D., Nesvorny, D.
& Elkins-Tanton, L. Science 330, 1527-1530 
(2010). 

4. Ida, S., Canup, R. M. & Stewart, G. R. Nature 389, 
353-357 (1997). 

5. Walker, R. J. Chem. Erde 69, 101-125 (2009). 

6. Touma, J. & Wisdom, J. Astron. J. 115, 1653-1663 
(1998). 

7. Ward, W. R. & Canup, R. M. Nature 403, 741-743 
(2000). 


Assassins of eyesight 


A molecular cascade involving the transcription factor SIX6 and its target 
gene p16INK4a causes the death of neurons that link the eye to the brain. This 
discovery deepens our understanding of a common form of blindness, glaucoma. 


ANDREW D. HUBERMAN & RANA N. EL-DANAF 


Vision might feel easy, but an immense
number of neurons are required to
perform routine visual functions, such
as reading, navigating the street or recognizing
faces. Tightly lining the back of the eye is
a layer of approximately 1 million neurons
called retinal ganglion cells (RGCs), which take
information encoded by the retina and pass it
to the brain¹. Glaucoma, a disease marked
by progressive, irreversible degeneration of
RGCs, is a common form of blindness,
affecting more than 60 million people worldwide².
Although many studies have sought to
understand the cellular and molecular basis of
glaucoma³, the mechanisms that drive RGC
death in this debilitating disease have remained
mysterious. But writing in Molecular Cell,
Skowronska-Krawczyk et al.⁴ report that certain
glaucoma-associated mutations in humans
are linked to a defined molecular pathway that
accelerates RGC ageing and death.

A constellation of risk factors has been 


associated with glaucoma, one of the greatest 
of which is age. Like other forms of neuro- 
degeneration, loss of RGCs occurs more 
often in people over 60, raising questions 
about whether similar mechanisms might 
underlie glaucoma and other age-related 
neurodegenerative disorders such as 
Alzheimer’s disease⁵. There also seems to be a
strong genetic component to glaucoma, with 
certain forms occurring four to five times more 
frequently in dark-skinned people⁶. Finally,
the disease is often thought to be caused by 
elevated fluid pressure inside the eye. How- 
ever, abnormally high intraocular pressures are 
neither 100% predictive of nor a prerequisite 
for glaucoma, and many people with the dis- 
ease have normal eye pressures’. This broad 
range of risk factors has led many to speculate 
that glaucoma is caused by a variety of indivi- 
dual stressors that all increase RGC susceptibil- 
ity to death. The key questions have therefore 
become: what are the common molecular 
pathways that trigger RGC loss, and how could 
those pathways be manipulated for therapies? 

Skowronska-Krawczyk et al. analysed 
genetic-association studies in several human 
populations to find genes that are commonly 
mutated in people with primary open-angle 
glaucoma (the most common form of the 
disease). One screen picked up SIX6, which 
encodes a transcription factor that helps to 
shape the eye during embryonic and postnatal 
development⁷. A mutation called His141, which
changes amino-acid residue 141 of the SIX6 
protein from asparagine to histidine, confers 
a risk of glaucoma. The authors performed a 
careful structural analysis, which revealed that 
this residue probably lies outside the transcrip- 
tion factor’s DNA-binding domain. Instead, 
the mutation might affect the ability of SIX6 to 
interact with other transcription factors or with 
co-factor proteins, altering the efficiency with 
which the protein can activate its target genes. 

To identify possible target genes for SIX6, 
Skowronska-Krawczyk and colleagues again
turned to genetic-association studies. These
indicated that mutations in the p16INK4a
gene are a strong risk factor for glaucoma.
The authors found that expression of both
p16INK4a and SIX6 was higher in eyes of
people with glaucoma than in those of healthy
people. Moreover, they demonstrated that
SIX6 binds to and activates p16INK4a.

In many cell types, p16INK4a is associated 
with a cellular ageing process called senes- 
cence. Skowronska-Krawczyk et al. found that 
approximately four times more RGCs were 
senescing in patients with glaucoma than in 
healthy people. To probe this pathway further, 
the authors engineered human retinal progeni- 
tor cells cultured in vitro to express the SIX6 
His141 mutation. The mutant protein strongly 
upregulated p16INK4a and another marker of 
cellular senescence, the IL-6 gene. This effect 
seems to be specific to the His141 mutation, 
because upregulation of these markers did 


[Figure 1 labels: p16INK4a; RGC senescence; Degradation and death]


Figure 1 | Molecular pathways that underlie glaucoma. Age, elevated pressure in the eye and certain
genetic mutations are all associated with an increased risk of glaucoma, a form of blindness linked to the
degradation of retinal ganglion cells (RGCs). Skowronska-Krawczyk et al.⁴ report that these risk factors
converge on a single molecular cascade in which the transcription factor SIX6 binds to and activates
the gene p16INK4a. Increased p16INK4a expression causes RGC senescence and, eventually, RGC
degradation and death.


not occur in cells producing wild-type SIX6 
or forms of SIX6 mutated at different residues. 
Together, the results indicate that the His141 
mutation increases the effectiveness with 
which SIX6 activates p16INK4a and triggers
senescence pathways in RGCs. 

Skowronska-Krawczyk and colleagues next 
explored whether activation of p16INK4a 
was linked to RGC ageing or death in mice in 
which intraocular pressure was experimen- 
tally raised. They found that expression of 
both SIX6 and p16INK4a increased markedly 
after experimental elevation of intraocular 
pressure. The evidence for an interaction 
between SIX6 and p16INK4a was further 
bolstered by the discovery that p16INK4a 
expression was reduced in mice lacking SIX6, 
and that elevated intraocular pressure increased 
SIX6-p16INK4a binding in wild-type mice. 
As in human glaucomatous retinas, increases 
in intraocular pressure dramatically elevated 
the number of senescent RGCs. Together, 
these results suggest that increased p16INK4a 
expression is a major cause of cellular- 
senescence pathways that ultimately lead to 
RGC degeneration and death in glaucoma. 

In a final set of experiments, the authors 
performed a crucial test of this model by assess- 
ing whether genetic deletion of p16INK4a or 
partial deletion of SIX6 impeded RGC death 
in a mouse model of glaucoma. Remarkably, 
when intraocular pressure was experimentally 
increased in either of these genetically mutated 
mouse strains, RGCs resisted death, strongly 
supporting the idea that SIX6-activated 
increases in p16INK4a mediate RGC loss in 
response to different stressors (Fig. 1). 

Skowronska-Krawczyk and colleagues’ 
study is an important step forward. First, it 
provides support for the long-held view that, 
even though different risk factors and stressors 
can increase the likelihood of glaucoma, there 
is a common molecular mechanism by which
those stressors act to kill RGCs. Second, the 
study indicates that cellular senescence and 
its associated pathways are precursors to RGC 
degeneration and death. 

Over the past few years, there has been a 
surge in our understanding about which RGCs 


are most vulnerable in early-stage glaucoma⁸,⁹,
and of the ion channels required to translate
intraocular pressure increases into RGC
degradation and death¹⁰. The current study pro-
vides a solid molecular foundation on which to 
integrate these findings. A more complete 
understanding of the biological underpinnings 
of glaucoma will no doubt also help to identify 
new targets for intervention, and might reveal 
mechanistic insights into the molecular basis of 
other age-related neurodegenerative diseases, 
such as Alzheimer’s and Parkinson’s disease.


Andrew D. Huberman and Rana N. El-Danaf
are in the Neurobiology Section, Division of 
Biological Science, and in the Departments of 
Neurosciences and Ophthalmology, School of 
Medicine, University of California, San Diego, 
La Jolla, California 92093, USA. A.D.H. is 
also at the Salk Institute for Biological Studies, 
La Jolla. 

e-mails: ahuberman@ucsd.edu; 
reldanaf@ucsd.edu 


1. Dhande, O. S. & Huberman, A. D. Curr. Opin.
Neurobiol. 24, 133-142 (2014). 

2. Kwon, Y.H., Fingert, J. H., Kuehn, M. H. & Alward, W. L. M. 
N. Engl. J. Med. 360, 1113-1124 (2009). 

3. Weinreb, R. N., Aung, T. & Medeiros, F. A. J. Am. Med. 
Assoc. 311, 1901-1911 (2014). 

4. Skowronska-Krawczyk, D. et al. Mol. Cell 59,
931-940 (2015). 

5. Jain, S. & Aref, A. A. J. Ophthalmic Vis. Res. 10, 
178-183 (2015). 

6. Tielsch, J. M. et al. J. Am. Med. Assoc. 266, 369-374 
(1991). 

7. Anderson, A. M., Weasner, B. M., Weasner, B. P. & 
Kumar, J. P. Development 139, 991-1000 (2012). 

8. Della Santina, L., Inman, D. M., Lupien, C. B., 
Horner, P. J. & Wong, R. O. J. Neurosci. 33, 
17444-17457 (2013). 

9. El-Danaf, R. N. & Huberman, A. D. J. Neurosci. 35, 
2329-2343 (2015). 

10.Ward, N. J., Ho, K. W., Lambert, W. S., Weitlauf, C. & 
Calkins, D. J. J. Neurosci. 34, 3161-3170 (2014). 


CORRECTION 

The News & Views article ‘Rehabilitation:
Boost for movement’ by Randolph J. Nudo
(Nature 527, 314-315; 2015) omitted
to mention that the author has declared
competing financial interests. Details are
available in the online version of the article.




ARTICLE 


OPEN 


doi:10.1038/nature16150 


Hemichordate genomes and 
deuterostome origins 


Oleg Simakov!?*, Takeshi Kawashima*+*, Ferdinand Marlétaz*, Jerry Jenkins°, Ryo Koyanagi®, Therese Mitros’, 
Kanako Hisata?, Jessen Bredeson’, Eiichi Shoguchi’, Fuki Gyoja’, Jia-Xing Yue*+, Yi-Chih Chen’, Robert M. Freeman Jr!°+, 


Akane Sasaki", Tomoe Hikosaka-Katayama!’, Atsuko Sato!’, Manabu Fujie®, Kenneth W. Baughman’, Judith Levine", 


14 


Paul Gonzalez!, Christopher Cameron", Jens H. Fritzenwanker"™, Ariel M. Pani!°, Hiroki Goto®, Miyuki Kanda®, Nana Arakaki®, 
Shinichi Yamasaki®, Jiaxin Qu!’, Andrew Creel’, Yan Ding!’, Huyen H. Dinh’’, Shannon Dugan”, Michael Holder", 

Shalini N. Jhangiani!’, Christie L. Kovar!’, Sandra L. Lee!’, Lora R. Lewis'’, Donna Morton", Lynne V. Nazareth”, 

Geoffrey Okwuonw”, Jireh Santibanez!’, Rui Chen!’, Stephen Richards’’”, Donna M. Muzny", Andrew Gillis!®, Leonid Peshkin”, 
Michael Wu’, Tom Humphreys”, Yi-Hsien Su’, Nicholas H. Putnam®+, Jeremy Schmutz°, Asao Fujiyama”, Jr-Kai Yu’, 
Kunifumi Tagawa!!, Kim C. Worley’, Richard A. Gibbs!”, Marc W. Kirschner!°, Christopher J. Lowe", Noriyuki Satoh, 


Daniel S. Rokhsar!”2! & John Gerhart’ 


Acorn worms, also known as enteropneust (literally, ‘gut-breathing’) hemichordates, are marine invertebrates that 
share features with echinoderms and chordates. Together, these three phyla comprise the deuterostomes. Here we 
report the draft genome sequences of two acorn worms, Saccoglossus kowalevskii and Ptychodera flava. By comparing 
them with diverse bilaterian genomes, we identify shared traits that were probably inherited from the last common 
deuterostome ancestor, and then explore evolutionary trajectories leading from this ancestor to hemichordates, 
echinoderms and chordates. The hemichordate genomes exhibit extensive conserved synteny with amphioxus and other 
bilaterians, and deeply conserved non-coding sequences that are candidates for conserved gene-regulatory elements. 
Notably, hemichordates possess a deuterostome-specific genomic cluster of four ordered transcription factor genes, the 
expression of which is associated with the development of pharyngeal ‘gill’ slits, the foremost morphological innovation 
of early deuterostomes, and is probably central to their filter-feeding lifestyle. Comparative analysis reveals numerous
deuterostome-specific gene novelties, including genes found in deuterostomes and marine microbes, but not other 
animals. The putative functions of these genes can be linked to physiological, metabolic and developmental specializations
of the filter-feeding ancestor.


The prominent pharyngeal gill slits, rigid stomochord, and midline 
nerve cords of acorn worms led 19th century zoologists to designate 
them as ‘hemichordates’ and group them with vertebrates and other 
chordates'~“, but their early embryos and larvae also linked them to 
echinoderms”*. Current molecular phylogenies strongly support the 
affinities of hemichordates and echinoderms as sister phyla, together 
called ambulacrarians’, and unite ambulacrarians and chordates within 
the deuterostomes (see glossary in Supplementary Note 1). Of all the 
shared derived morphological characters proposed between hemichor- 
dates and chordates, the pharyngeal gill slits have emerged with unam- 
biguous morphological and molecular support, notably the shared 
expression of the pax1/9 gene! These structures were ancestral 
deuterostome characters elaborated upon the bilaterian ancestral body 
plan, but the gill slits were subsequently lost in extant echinoderms and
amniotes’'. Since extant invertebrate deuterostomes use this apparatus
for efficient suspension and/or deposit feeding, the early Cambrian or 
Precambrian deuterostome ancestor probably also shared this lifestyle. 
This perspective on the last common deuterostome ancestor informs 
our understanding of the subsequent evolution of hemichordates, 
echinoderms and chordates’”!?"!®, 

Hemichordates share bilateral symmetry, gill slits, soft bodies and 
early axial patterning with chordates, making them key comparators 
for inferring the ancestral genomic features of deuterostomes. To 
this end, we sequenced and analysed the genomes of acorn worms 
belonging to the two main lineages of enteropneust hemichordates 
(Supplementary Note 1): Saccoglossus kowalevskii (Harrimaniidae; 
Atlantic, North America, Fig. 1a) and Ptychodera flava (Ptychoderidae; 
Pacific, pan-tropical, Fig. 1b). Both have characteristic three-part 


1Molecular Genetics Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa 904-0495, Japan. 2Department of Molecular Evolution, Centre for Organismal Studies,
University of Heidelberg, 69115 Heidelberg, Germany. 3Marine Genomics Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa 904-0495, Japan. 4Department
of Zoology, University of Oxford, Oxford OX1 3PS, UK. 5HudsonAlpha Institute of Biotechnology, Huntsville, Alabama 35806, USA. 6DNA Sequencing Section, Okinawa Institute of Science and
Technology Graduate University, Onna, Okinawa 904-0495, Japan. 7Department of Molecular and Cell Biology, University of California, Berkeley, California 94720-3200, USA. 8Department of
Ecology and Evolutionary Biology, Rice University, Houston, Texas 77005, USA. 9Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan. 10Department of Systems
Biology, Harvard Medical School, Boston, Massachusetts 02115, USA. 11Marine Biological Laboratory, Graduate School of Science, Hiroshima University, Onomichi, Hiroshima 722-0073, Japan.
12Natural Science Center for Basic Research and Development, Gene Science Division, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8527, Japan. 13Marine Biological Association of
the UK, The Laboratory, Citadel Hill, Plymouth PL1 2PB, UK. 14Department of Biology, Hopkins Marine Station, Stanford University, Pacific Grove, California 93950, USA. 15Département de sciences
biologiques, University of Montreal, Quebec H3C 3J7, Canada. 16University of North Carolina at Chapel Hill, North Carolina 27599, USA. 17Human Genome Sequencing Center, Department of
Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, MS BCM226, Houston, Texas 77030, USA. 18Department of Zoology, University of Cambridge, Cambridge CB2 3EJ,
UK. 19Institute for Biogenesis Research, University of Hawaii, Hawaii 96822, USA. 20National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan. 21US Department of Energy Joint Genome
Institute, Walnut Creek, California 94598, USA. †Present addresses: University of Tsukuba, Tsukuba, Ibaraki 305-572, Japan (T.K.); Institute for Research on Cancer and Aging, Nice (IRCAN), CNRS
UMR 7284, INSERM U 1081, Nice 06107, France (J.-x.Y.); FAS Research Computing, Harvard University, Cambridge, Massachusetts 02138, USA (R.M.F.); Dovetail Genomics, Santa Cruz, California
95060, USA (N.H.P.).
*These authors contributed equally to this work.


26 NOVEMBER 2015 | VOL 527 | NATURE | 459 


© 2015 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


Figure 1 | Hemichordate model systems and their embryonic 
development. The hemichordate phylum includes the enteropneusts 
(acorn worms) and pterobranchs (minute, colonial, tube-dwelling; not 
shown). a, c, Saccoglossus kowalevskii (Harrimaniid (direct developing) 
enteropneust) adult (a) and juvenile (c) with gill slits. b, d, Ptychodera 


bodies comprising proboscis, collar and trunk, the last with tens to 
hundreds of pairs of gill slits. While S. kowalevskii develops directly to a 
juvenile worm with these traits within days (Fig. 1c, e), P. flava develops
indirectly through a feeding larva that metamorphoses to a juvenile 
worm after months in the plankton (Fig. 1d, e). Our analyses begin 
to integrate macroscopic information about morphology, organismal 
physiology, and descriptive embryology of these deuterostomes with 
genomic information about gene homologies, gene arrangements, gene 
novelties and non-coding elements. 


Genomes 

We sequenced the two acorn worm genomes by random shotgun meth- 
ods with a variety of read types (Methods; Supplementary Note 2), 
each starting from sperm from a single outbred diploid individual. The 
haploid lengths of the two genomes are both about 1 Gbp (Extended 
Data Fig. 1), but differ in nucleotide heterozygosity. Both acorn worm 
genomes were annotated using extensive transcriptome data as well 
as standard homology-based and de novo methods (Supplementary 
Note 3). Counting gene models with at least one detectable orthologue 
in another sequenced metazoan species, we find that Ptychodera and 
Saccoglossus encode at least 18,556 and 19,270 genes, respectively 
(Methods). Additional de novo gene predictions include divergent and/ 
or novel genes (Extended Data Fig. 1). Despite the ancient divergence 
of the Saccoglossus and Ptychodera lineages (more than 370 million 
years ago, see below) and their different modes of development, the 
two acorn worm genomes have similar bulk gene content, as discussed 
later (Extended Data Fig. 2 and Supplementary Note 4), and similar 
repetitive landscapes (Supplementary Note 5). 


Deuterostome phylogeny 

Deuterostome relationships were originally inferred from develop- 
mental and morphological characters”*!” and these hypotheses were 
later tested and refined with molecular data®’. Aspects of deuterostome 
phylogeny continue to be controversial, however, notably the position 
of the sessile pterobranchs among hemichordates, and the surprising 
association of Xenoturbella'® and acoelomorph flatworms with ambu- 
lacrarians’’ proposed by some studies. We explored these issues using 
genome-wide analyses of the newly sequenced hemichordate genomes 
augmented with extensive new RNAseq from five echinoderms, three 
additional hemichordates (including a rhabdopleurid pterobranch) and 
two acoels (Fig. 2, Extended Data Fig. 3, Methods and Supplementary 
Note 6). We recovered the monophyly of hemichordates, echino- 
derms, ambulacrarians and deuterostomes, using not only amino acid 


[Figure 1e diagram: developmental stages (blastula, gastrula, late gastrula, tornaria, juvenile) of Saccoglossus kowalevskii (direct-developing; days) and Ptychodera flava (indirect-developing; long pelagic period of months), with germ layers keyed as ectoderm, endoderm, mesoderm and endomesoderm.]

flava (Ptychoderid (indirect developing) enteropneust) adult (b) and
the tornaria stage larva (d). Gill slits labelled with an asterisk in a and b.
e, Comparison of the direct and indirect modes of development of the two
hemichordates, indicating the long pelagic larval period in Ptychodera
until the settlement and metamorphosis as a juvenile.


characters but also presence–absence characters for introns and coding
indels (Supplementary Note 4). Our analyses also placed pterobranch 
hemichordates as the sister-group to enteropneusts’ rather than within 
them”. These phylogenetic analyses imply that genomic traits shared 
by chordates and ambulacrarians can be attributed to the last common 
deuterostome ancestor (see below). Using a relaxed molecular clock, 
we estimate a Cambrian origin of hemichordates (Methods, Extended 
Data Fig. 3 and Supplementary Note 6). 

We also performed several analyses to assess the controversial rela- 
tionships between Xenoturbella, acoelomorphs and deuterostomes 
(Supplementary Note 6). With conventional site-homogeneous mod- 
els, acoels remain outside deuterostomes20–23 (Fig. 2, Supplementary
Figs 6.1 and 6.2). Alternative models24, however, show equivo-
cal branching of acoels depending on the inclusion of the current 
sparse data for Xenoturbella (Supplementary Note 6). Notably, with- 
out Xenoturbella, acoels are positioned as a bilaterian sister group 
(Supplementary Fig. 6.3)”, Although we cannot rule out a deuteros- 
tome placement for Xenoturbella, our analyses generally do not support
a grouping of acoels with deuterostomes’”.


The gene set of the deuterostome ancestor 

By comparative analysis, we identified 8,716 families of homologous 
genes whose distributions in sequenced extant genomes imply their 
presence in the deuterostome ancestor (Methods; Supplementary Note 
4). Owing to gene duplication and other processes the descendants of 
these ancestral genes account for ~14,000 genes in extant deuteros- 
tome genomes including human (Supplementary Table 4.1.2). The dis- 
tributions of gene functions, domain compositions, and gene family 
sizes of hemichordates resemble those of amphioxus, sea urchin, and 
sequenced lophotrochozoans more than those of ecdysozoans; verte- 
brates also form a distinct group (Extended Data Fig. 2, Supplementary 
Note 4 and Supplementary Fig. 4.2). 
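To illustrate how a family's distribution across extant genomes can imply its presence in the deuterostome ancestor, the following Python sketch applies a simplified presence/absence rule. The actual criteria are given in Methods and Supplementary Note 4 and are not reproduced here; the rule, the clade sets and the family names below are assumptions made purely for illustration.

```python
# Hypothetical clade assignments; the taxon sampling in the paper is larger.
AMBULACRARIA = {"Saccoglossus", "Ptychodera", "sea_urchin"}
CHORDATA = {"amphioxus", "human"}
OUTGROUPS = {"Lottia", "Drosophila", "Nematostella"}


def in_deuterostome_ancestor(species_with_family):
    """Return True if a family's distribution implies presence in the ancestor.

    Assumed rule (a simplification, not the paper's exact criterion):
    the family spans both deuterostome branches, or it occurs in at least
    one deuterostome and in at least one non-deuterostome outgroup.
    """
    present = set(species_with_family)
    in_ambulacraria = bool(present & AMBULACRARIA)
    in_chordata = bool(present & CHORDATA)
    in_outgroup = bool(present & OUTGROUPS)
    spans_deuterostomes = in_ambulacraria and in_chordata
    older_than_deuterostomes = (in_ambulacraria or in_chordata) and in_outgroup
    return spans_deuterostomes or older_than_deuterostomes


# Toy families (hypothetical names and distributions).
families = {
    "famA": ["Saccoglossus", "human"],      # spans both deuterostome branches
    "famB": ["Ptychodera", "Drosophila"],   # deuterostome plus outgroup
    "famC": ["human"],                      # lineage-restricted
    "famD": ["Lottia", "Nematostella"],     # absent from deuterostomes
}
ancestral = sorted(f for f, sp in families.items() if in_deuterostome_ancestor(sp))
print(ancestral)
```

A rule of this kind treats absence in a single genome as loss or missing data rather than as evidence against ancestral presence, which is one reason draft assemblies can still support ancestral gene set estimates.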

Exon-intron structures of genes are generally well conserved 
among hemichordates, chordates, and many non-deuterostome meta- 
zoans, allowing us to infer 2,061 ancestral deuterostome splice sites 
(Supplementary Note 4). Among orthologous bilaterian genes we found 
23 introns and 4 coding sequence indels present only in deuterostomes 
(shared between at least one ambulacrarian and chordate), suggesting 
that these shared derived characters may be useful to diagnose clade 
membership of new candidate organisms (Supplementary Note 4). 
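The use of such shared derived characters can be sketched as follows. This is a minimal Python illustration, assuming a presence/absence matrix keyed by character and species; the character names, species lists and the scoring function `support` are hypothetical and are not taken from Supplementary Note 4.

```python
def deuterostome_specific(matrix, ambulacrarians, chordates, others):
    """Select characters present in >=1 ambulacrarian and >=1 chordate,
    and absent from every non-deuterostome species in `others`."""
    selected = []
    for char, states in matrix.items():
        in_amb = any(states.get(s, 0) for s in ambulacrarians)
        in_cho = any(states.get(s, 0) for s in chordates)
        in_other = any(states.get(s, 0) for s in others)
        if in_amb and in_cho and not in_other:
            selected.append(char)
    return selected


def support(candidate_states, diagnostic):
    """Fraction of diagnostic characters that a candidate organism shares."""
    if not diagnostic:
        return 0.0
    shared = sum(1 for c in diagnostic if candidate_states.get(c, 0))
    return shared / len(diagnostic)


# Toy matrix: 1 = character present, 0 = absent (hypothetical values).
matrix = {
    "intron_1": {"Saccoglossus": 1, "human": 1, "Drosophila": 0},
    "intron_2": {"Saccoglossus": 1, "human": 1, "Drosophila": 1},
    "indel_1": {"sea_urchin": 1, "amphioxus": 1, "Lottia": 0},
    "indel_2": {"Saccoglossus": 1, "human": 0, "Lottia": 0},
}
diagnostic = deuterostome_specific(
    matrix,
    ambulacrarians=["Saccoglossus", "sea_urchin"],
    chordates=["human", "amphioxus"],
    others=["Drosophila", "Lottia"],
)
candidate = {"intron_1": 1, "indel_1": 0}
print(diagnostic, support(candidate, diagnostic))
```

In this toy case `intron_2` is excluded because it also occurs outside deuterostomes, and `indel_2` because it is not shared between an ambulacrarian and a chordate.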

Based on whole-genome alignments, we identified 6,533 con- 
served non-coding elements (CNE) longer than 50 bp that are found 
in all of the five deuterostomes Saccoglossus, Ptychodera, amphioxus, 




[Figure 2 tree: species-level tip labels not reproduced. Labelled clades include Vertebrata, Urochordata, Cephalochordata, Echinodermata, Hemichordata (Enteropneusta, Pterobranchia), Xenoturbella and Acoela.]


Figure 2 | Phylogenetic placement of deuterostome taxa within the 
metazoan tree. Maximum-likelihood tree obtained with a super-matrix of 
506,428 amino-acid residues gathered from 1,564 orthologous genes in 52 
species (65.1% occupancy) and using a LG+T model partitioned for each 


sea urchin, and human (Methods; Supplementary Note 8). The iden- 
tified CNEs overlap extensively with human long non-coding RNAs 
(3,611 CNE loci; 55%, Fisher’s exact test P value < 2.2 × 10^−16). Those
alignments usually do not exceed 250 bp (as has been reported among 
vertebrates’) and occur in clusters (Supplementary Note 8). Among 
these conserved sequences is a previously identified vertebrate brain 
and neural tube specific enhancer, located close to the sox14/21 ortho- 
logue in all five species”®. 
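The overlap statistic above can be illustrated with a short Python sketch. The counts 3,611 and 6,533 are those reported in the text; the background set used for the test is not stated in this section, so the 2 × 2 table passed to the function below is a hypothetical toy example, and the one-sided form of the test is an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test P value for enrichment in a 2x2 table.

    Table layout:            overlap   no overlap
        elements of interest    a          b
        background              c          d
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Reported counts: CNEs overlapping human long non-coding RNAs.
cne_total, cne_overlap = 6533, 3611
percent_overlap = round(100 * cne_overlap / cne_total)
print(percent_overlap)

# Hypothetical toy table, only to show the call; not the paper's data.
p_toy = fisher_exact_greater(8, 2, 1, 5)
print(p_toy)
```

With genome-scale counts the same calculation yields very small P values, which statistical software commonly reports as a floor value rather than an exact number.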


Conserved gene linkage 

Ancient gene linkages (‘macro-synteny””’) are often preserved in extant 
bilaterian genomes’”**. Comparative analysis revealed 17 ancestral 
linkage groups across chordates, including amphioxus and Ciona”’. 
While the contiguity of the draft of the sea urchin genome assembly”? 
is too limited to determine whether it shares this chromosome-scale 
organization, we find that the Saccoglossus genome clearly shares these 
chordate-defined linkage groups (Fig. 3a and Supplementary Note 7), 
implying that these chromosome-scale linkages were also present in 
the ancestral deuterostome. 
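A comparison of this kind can be sketched by counting orthologous gene pairs that fall in each combination of Saccoglossus scaffold and chordate linkage group. The Python below is a minimal illustration under assumed inputs; the gene identifiers, scaffold names and the function names are hypothetical, and the paper's actual procedure is described in Supplementary Note 7.

```python
from collections import Counter, defaultdict


def linkage_counts(orthologs, sacc_scaffold, amph_alg):
    """Count orthologous pairs per (Saccoglossus scaffold, amphioxus ALG) cell."""
    counts = defaultdict(Counter)
    for sacc_gene, amph_gene in orthologs:
        scaffold = sacc_scaffold.get(sacc_gene)
        alg = amph_alg.get(amph_gene)
        if scaffold is not None and alg is not None:
            counts[scaffold][alg] += 1
    return counts


def best_alg(counts):
    """For each scaffold, return the (ALG, count) pair with the most orthologues."""
    return {scaf: c.most_common(1)[0] for scaf, c in counts.items()}


# Toy input (hypothetical identifiers).
orthologs = [("s1", "a1"), ("s2", "a2"), ("s3", "a3"), ("s4", "a4")]
sacc_scaffold = {"s1": "scafA", "s2": "scafA", "s3": "scafA", "s4": "scafB"}
amph_alg = {"a1": 1, "a2": 1, "a3": 2, "a4": 2}

counts = linkage_counts(orthologs, sacc_scaffold, amph_alg)
best = best_alg(counts)
print(best)
```

Conserved macro-synteny appears as scaffolds whose orthologues concentrate in a single linkage group, which is what the dense blocks in a dot plot such as Fig. 3a represent.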

On a more local scale, we find hundreds of tightly linked conserved
gene clusters of three or more genes (‘micro-synteny’; Methods; 
Supplementary Note 7) including Hox*’ and ParaHox’! clusters in both 
acorn worms (Extended Data Fig. 4), as also found in echinoderms***?, 
Saccoglossus and amphioxus share more micro-syntenic linkages with 
each other than either does with sea urchin, vertebrates, or available pro- 
tostome genomes (Methods, Fig. 3b and Extended Data Figs 5 and 6). 
Conservation of micro-syntenic linkages can occur due to low rates of 
genomic rearrangement or, more interestingly, as a result of selection 


gene. Filled circles at nodes denote maximal bootstrap support.
Taxa highlighted in bold are newly sequenced genomes and
transcriptomes introduced in this study. Bar indicates the number of
substitutions per site.


to retain linkages between genes and their regulatory elements located 
in neighbouring genes”*® 
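The notion of a micro-syntenic block of three or more genes can be made concrete with a small Python sketch. It assumes each scaffold is given as an ordered list of orthologue-group identifiers, with each identifier occurring once per scaffold, and it chains genes that are close together in both species. The parameters `min_genes` and `max_gap` and the toy gene names are illustrative assumptions, not the settings used in Methods or Supplementary Note 7.

```python
def conserved_blocks(order_a, order_b, min_genes=3, max_gap=1):
    """Find runs of shared genes that are tightly linked in both scaffolds.

    Two successive shared genes are chained when they are separated by at
    most `max_gap` intervening genes in species A and in species B.
    Gene order and orientation within a block may differ between species.
    """
    pos_b = {fam: i for i, fam in enumerate(order_b)}
    shared = [(i, pos_b[fam], fam) for i, fam in enumerate(order_a) if fam in pos_b]

    blocks, current = [], []
    for i, j, fam in shared:
        if current:
            prev_i, prev_j, _ = current[-1]
            close = (i - prev_i) <= max_gap + 1 and abs(j - prev_j) <= max_gap + 1
            if not close:
                # Close the current run and keep it only if it is long enough.
                if len(current) >= min_genes:
                    blocks.append([f for _, _, f in current])
                current = []
        current.append((i, j, fam))
    if len(current) >= min_genes:
        blocks.append([f for _, _, f in current])
    return blocks


# Toy scaffolds (hypothetical orthologue groups).
order_a = ["g1", "g2", "x1", "g3", "x2", "g4", "g5"]
order_b = ["g3", "g2", "g1", "y1", "y2", "y3", "g5", "g4"]
blocks = conserved_blocks(order_a, order_b)
print(blocks)
```

Counting such blocks for each pair of genomes gives the kind of presence/absence data from which gain and loss of synteny can then be estimated along a tree.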


A deuterostome pharyngeal gene cluster 
One conserved deuterostome-specific micro-syntenic cluster with 
functional implications for deuterostome biology is a cluster of genes 
expressed in the pharyngeal slits and surrounding pharyngeal endo- 
derm (Fig. 4; Supplementary Note 9). This six-gene cluster contains 
four transcription factor genes in the order nkx2.1, nkx2.2, pax1/9 
and foxA, along with two non-transcription-factor genes slc25A21 
and mipol1, whose introns harbour regulatory elements for pax1/9 and
foxA, respectively*+*°, The cluster was first found conserved across 
vertebrates including humans (see chromosome 14; 1.1 Mb length from 
nkx2.1 to foxA1)***”, In S. kowalevskii, it is intact with the same gene 
order as in vertebrates (0.5 Mb length from nkx2.1 to foxA), imply-
ing that it was present in the deuterostome and ambulacrarian ances- 
tors. The full ordered gene cluster also exists on a single scaffold in 
the crown-of-thorns sea star Acanthaster planci. Since these genes are 
not clustered in available protostome genomes, there is no evidence 
for deeper bilaterian ancestry. Two non-coding elements that are con- 
served across vertebrates and amphioxus*® are found in the hemichor- 
date and A. planci clusters at similar locations (A2 and A4, in Fig. 4a). 
The pax1/9 gene, at the centre of the cluster, is expressed in the phar-
yngeal endodermal primordium of the gill slit in hemichordates, tuni- 
cates, amphioxus, fish, and amphibians*”, and in the branchial pouch 
endoderm of amniotes (which do not complete the last steps of gill slit 
formation), as well as other locations in vertebrates. The nkx2.1 (thy- 
roid transcription factor 1) gene is also expressed in the hemichordate 


[Figure 3: a, dot plot with axes labelled Saccoglossus and Amphioxus; b, tree with tips Drosophila melanogaster, Tribolium castaneum, Daphnia pulex, Ixodes scapularis, Caenorhabditis elegans, Helobdella robusta, Capitella teleta, Lottia gigantea, Pinctada fucata, Crassostrea gigas, Homo sapiens, Mus musculus, Gallus gallus, Xenopus tropicalis, Danio rerio, Branchiostoma floridae, Strongylocentrotus purpuratus, Ptychodera flava, Saccoglossus kowalevskii and Nematostella vectensis; scale bar, 0.02.]


Figure 3 | High level of linkage conservation in Saccoglossus. 

a, Macro-synteny dot plot between Saccoglossus and amphioxus; each dot 
represents two orthologous genes linked in the two species, and ordered 
according to their macro-syntenic linkage. Amphioxus scaffolds are 
organized according to the 17 ancestral linkage groups (ALGs) inferred by 
comparison of the amphioxus and vertebrate genomes”’. Intersection areas 
of highest dot density are marked by numbers along the top of the plot, 
identifying each of the 17 putative ALGs. Axes represent orthologous gene 
group index along the genome. b, Branch-length estimation for loss and 
gain of synteny blocks with MrBayes, see Supplementary Note 7 for details. 
Short branches in hemichordates (in bold) indicate a high level of 
micro-syntenic retention in their genomes. 


pharyngeal endoderm in a band passing through the gill slit, but not 
localized to a thyroid-like organ*?. Here we also examined the expres- 
sion of nkx2.2 and foxA in S. kowalevskii. We find that nkx2.2, which 
is expressed in the ventral hindbrain in vertebrates, is expressed in 
pharyngeal ventral endoderm in S. kowalevskii, close to the gill slit 
(Fig. 4b), and that foxA is expressed throughout endoderm but 
repressed in the gill slit region (Fig. 4b). The co-expression of this 
ordered cluster of the four transcription factors during pharyngeal 
development strongly supports the functional importance of their 
genomic clustering. 
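Whether a scaffold carries the four transcription factor genes in the conserved order can be checked with a few lines of Python. This is a sketch under stated assumptions: the scaffold is represented as an ordered list of gene names, `geneX` is a hypothetical placeholder for an unrelated intervening gene, and the check accepts either strand orientation and does not handle paralogous copies on the same scaffold.

```python
CLUSTER_ORDER = ["nkx2.1", "nkx2.2", "pax1/9", "foxA"]


def has_ordered_cluster(scaffold_genes, cluster=CLUSTER_ORDER):
    """True if every cluster gene occurs on the scaffold in the given order
    or in its exact reverse; intervening genes are allowed."""
    hits = [g for g in scaffold_genes if g in cluster]
    return hits == list(cluster) or hits == list(reversed(cluster))


# Toy scaffolds (hypothetical arrangements for illustration).
intact = ["nkx2.1", "nkx2.2", "geneX", "pax1/9", "slc25A21", "mipol1", "foxA"]
inverted = list(reversed(intact))
split = ["nkx2.1", "nkx2.2"]                  # remaining genes on another scaffold
shuffled = ["pax1/9", "nkx2.1", "nkx2.2", "foxA"]

print(has_ordered_cluster(intact), has_ordered_cluster(inverted))
print(has_ordered_cluster(split), has_ordered_cluster(shuffled))
```

A failed check on a fragmented draft assembly does not by itself show that the cluster is absent, because its genes may lie on separate scaffolds.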

The presence of this cluster in the crown-of-thorns sea star, an 
echinoderm that lacks gill pores, and in amniote vertebrates that lack 
gill slits, suggests that the cluster’s ancestral role was in pharyngeal 
apparatus patterning as a whole, of which overt slits (perforations of 


[Figure 4a diagram: gene order around the pharyngeal cluster on H. sapiens chromosome 14 (EGLN3, NKX2.1, NKX2.8, PAX9, SLC25A21, MIPOL1, FOXA1, DHRS7) and its paralogous region (NKX2.4, NKX2.2, PAX1, SLC25A6, FOXA2), and on scaffolds of S. kowalevskii, P. flava, B. floridae, A. planci and L. gigantea; in L. gigantea, Nkx2.1 to FoxA > 5 Mb.]


Figure 4 | Conservation of a pharyngeal gene cluster across 
deuterostomes. a, Linkage and order of six genes including the four genes 
encoding transcription factors Nkx2.1, Nkx2.2, Pax1/9 and FoxA, and two 
genes encoding non-transcription factors Slc25A21 (solute transporter) 
and Mipol1 (mirror-image polydactyly 1 protein), which are putative
‘bystander’ genes containing regulatory elements of pax1/9 and foxA,
respectively. The pairings of slc25A21 with pax1/9 and of mipol1 with foxA
occur also in protostomes, indicating bilaterian ancestry. The cluster is 
not present in protostomes such as Lottia (Lophotrochozoa), Drosophila 
melanogaster, Caenorhabditis elegans (Ecdysozoa), or in the cnidarian, 
Nematostella. SLC25A6 (the slc25A21 paralogue on human chromosome 
20) is a potential pseudogene. The dots marking A2 and A4 indicate two 
conserved non-coding sequences first recognized in vertebrates and 
amphioxus”, also present in S. kowalevskii and, partially, in P. flava and
A. planci. b, The four transcription factor genes of the cluster are expressed
in the pharyngeal/foregut endoderm of the Saccoglossus juvenile: nkx2.1
is expressed in a band of endoderm at the level of the forming gill pore,
especially ventral and posterior to it (arrow), and in a separate ectodermal
domain in the proboscis. It is also known as thyroid transcription factor
1 due to its expression in the pharyngeal thyroid rudiment in vertebrates.
The nkx2.2 gene is expressed in pharyngeal endoderm just ventral to
the forming gill pore, shown in side view (arrow indicates gill pore) and
ventral view; and pax1/9 is expressed in the gill pore rudiment itself. In
S. kowalevskii, this is its only expression domain, whereas in vertebrates
it is also expressed in axial mesoderm. The foxA gene is expressed widely
in endoderm but is repressed at the site of gill pore formation (arrow). An
external view of gill pores is shown; up to 100 bilateral pairs are present in
adults, indicative of the large size of the pharynx.




[Figure 5 diagrams: a, sialic acid pathway (labels include UDP-N-acetylglucosamine, GNE, NANS, NANP, CMAS, CMP-Neu5Ac, CMP-Neu5Gc, ST6Gal, ST6GalNAc, ST8Sia, sialic acid transporter, Golgi apparatus, lysosome and extracellular space); b–d, signalling schematics for Nodal, Lefty, GDF1, the Cripto co-receptor and Smad2/3.]


Figure 5 | Examples of deuterostome gene novelties. a, Steps of
biosynthesis of sialic acid and its addition to and removal from
glycoproteins. b–d, Novel genes in TGFβ signalling pathways. The
encoded proteins are shown and include Lefty (b), an antagonist of Nodal
signalling, which activates Smad2/3-dependent transcription when not
antagonized; Univin (c), an agonist of Nodal signalling, also called Vg1,
DVR1, and GDF1; and TGFβ2 (d), a ligand that activates Smad2/3-
dependent transcription by binding to a deuterostome-specific TGFβ


apposed endoderm and ectoderm) were but one part, and the cluster is 
retained in these cases because of its continuing contribution to phar- 
ynx development. Genomic regions of the pharyngeal cluster have been 
implicated in long-range promoter-enhancer interactions, support- 
ing the regulatory importance of this gene linkage (see Supplementary 
Note 9)*°. Alternatively, genome rearrangement in these lineages may 
be too slow to disrupt the cluster even without functional constraint. 
Here we propose that the clustering of the four ordered transcription 
factors, and their bystander genes, on the deuterostome stem served a 
regulatory role in the evolution of the pharyngeal apparatus, the fore- 
most morphological innovation of deuterostomes. 


Deuterostome novelties 

We found >30 deuterostome genes with sequences that differ mark- 
edly from those of other metazoans, related to functional innovation 
in deuterostomes. Some plausibly arose from accelerated sequence 
change on the deuterostome stem from distant but identifiable bila- 
terian homologues, others represent new protein domain combina- 
tions in deuterostomes, while others lack identifiable sequence and 
domain homologues in other animals. In the latter group, we found 
over a dozen deuterostome genes that have readily identified relatives 
in marine microbes, often cyanobacteria or eukaryotic micro-algae, 
but are not known in other metazoans (Extended Data Table 1 and 
Extended Data Fig. 7; Supplementary Notes 10.4 and 10.5). Such genes 


receptor type II, which contains a novel ectodomain (not shown). Also
shown in d is the novel protein thrombospondin 1 that activates TGFβ2
by releasing it from an inactive complex, by way of its TSP1 domains. Red
boxes around protein names indicate their deuterostome novelty. Green
boxes around the names indicate genes with pan-metazoan/bilaterian
ancestry and without accelerated sequence change in the deuterostome
lineage.

include two of the novel deuterostome sequences associated with sialic 
acid metabolism (found in many microbes“', see below), enzymes that 
modify proteins (for example, protein arginine deiminase) and RNA 
(for example, FATSO methyladenosine demethylase) as well as others 
that provide specialized reactions of secondary metabolism (Extended 
Data Table 1 and Extended Data Fig. 7; Supplementary Note 10.5). 
Possible explanations for the unusual phylogenetic distribution of these 
genes include horizontal transfer on the deuterostome stem from early 
marine microbes (which were plausibly commensals, pathogens, or 
food sources of stem deuterostomes), or convergent gene loss and/or 
extensive sequence divergence along five or more opisthokont lineages 
(Supplementary Note 10.2). 

Regardless of their mechanism of origination, the various deuteros- 
tome novelties and gene family expansions of sialic acid metabolism are 
noteworthy. Deuterostomes are unique among metazoans in their high 
level and diverse linkage of addition of sialic acid (also known as neu- 
raminic acid), a nine carbon negatively charged sugar, to the terminal 
sugars of glycoproteins, mucins and glycolipids””. We find expanded 
families of enzymes for several of these reactions in hemichordates 
(Fig. 5a and Extended Data Table 1). Based on the presence/absence 
of relevant enzymes we infer that 5 of the 11 steps of the pathways of 
sialic acid formation, addition to termini, and removal are not found 
in protostomes or other metazoans, and are deuterostome novelties 
(Fig. 5a and Supplementary Note 10), whereas the other steps use
enzymes similar to those of the more limited pathway of some protos- 
tomes (for example, insects such as Drosophila). 

The importance of glycoproteins for muco-ciliary feeding and other 
hemichordate activities is further supported by novel and expanded 
families of genes encoding the polypeptide backbones of glycopro- 
teins, those with von Willebrand type-D and/or cysteine-rich domains 
(PTHR11339 classifier), including mucins, present in hemichordates 
and amphioxus as large tandemly duplicated clusters (with varied 
expression patterns as shown in Extended Data Fig. 8), but not in sea 
urchin, which has a different mode of feeding (Supplementary Note 
10). As in amphioxus, the pharynx of Saccoglossus is heavily ciliated***, 
and cells of the pharyngeal walls in hemichordates and the ventral 
endostyle in amphioxus secrete abundant mucins and glycoproteins”®. 
Similarly, in the deuterostome ancestor these glycoproteins probably 
enhanced the muco-ciliary filter-feeding capture of food particles from 
the microbe-rich marine environment and protected its inner and outer 
tissue surfaces. 


Novelty in the TGFβ signalling pathway

The signalling ligands Lefty (a Nodal antagonist) and Univin/Vg1/ 
GDF1” (a Nodal agonist) are deuterostome innovations that modulate 
Nodal signalling during the major developmental events of endomeso- 
derm induction and axial patterning in vertebrates, axial patterning 
in hemichordates and echinoderms, and left-right patterning in all 
deuterostomes*® (see Fig. 5b-d and Extended Data Fig. 9a, b). Univin 
is tightly linked to the related bilaterian bmp2/4 in the sea urchin 
genome” and also, we now report, in hemichordates and amphioxus, 
supporting its origin by tandem duplication and divergence from an 
ancestral bmp2/4-type gene, as suggested previously”. 

TGFβ2 signalling (TGFβ1, 2 and 3 in vertebrates) is a deuterostome
innovation that controls cell growth, proliferation, differentiation
and apoptosis at later developmental stages. Accompanying the novel
TGFβ2 ligand, the type II receptor has a novel ectodomain. The extra-
cellular matrix protein thrombospondin 1, which activates TGFβ2 in
vertebrates, contains a deuterostome-unique combination of domains
including three thrombospondin type 1 (TSP1) domains that bind
the TGFβ2 pro-domain region. While these signalling novelties have
clear sequence similarity to pan-bilaterian components, they form 
long stem branch clades on the phylogenetic trees, indicating exten- 
sive sequence divergence on the deuterostome stem (Supplementary 
Note 10). Together, these innovations appear to contribute to the 
increased amount and complex patterning of Smad2/3-mediated 
signalling in deuterostomes compared with protostomes and other 
metazoans. 


Conclusion 

The two acorn worms whose genomes are described here represent 
the two main enteropneust lineages, separated by at least 370 million 
years and differing in their developmental modes. These analyses 
reveal (1) extensive conserved macro-synteny among deuterostomes; 
(2) a widely conserved deuterostome-specific cluster of six ordered 
genes, including four transcription factor genes that are expressed 
during the development of pharyngeal gill slits and the branchial 
apparatus, the most prominent morphological innovation of the deu- 
terostome ancestor; and (3) numerous gene novelties shared among 
deuterostomes, many expanded into large families, with putative pro- 
tein functions that imply physiological, metabolic and developmental 
specializations of the filter-feeding deuterostome ancestor. Some of 
these genes lack identifiable orthologues in other metazoans but do 
resemble microbial sequences and domain types. In addition to their 
contributions towards defining the deuterostome ancestor and illu- 
minating chordate origins, the two genomes should inform hypoth- 
eses of larval evolution by providing a basis for future comparisons 
of direct-developing and indirect-developing acorn worms, which 
achieve remarkably similar adult forms by distinct embryological 
routes (Fig. 1). 




Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 15 July; accepted 13 October 2015. 
Published online 18 November 2015. 


1. Bateson, W. The later stages in the development of Balanoglossus Kowalevskii, 
with a suggestion as to the affinities of the Enteropneusta. 2 parts. 
Q. J. Microsc. Sci. 25, 81-122 (1885). 

2. Bateson, W. Memoirs: the ancestry of the Chordata. Q. J. Microsc. Sci. 2, 
535-572 (1886). 

3. Kovalevskij, A. O. Anatomie des Balanoglossus delle Chiaje (Mémoires de 
l'Académie Impériale des Sciences de St. Pétersbourg: Imperatorskaja 
Akademija Nauk, 1866). 

4. Agassiz, A. The history of Balanoglossus and tornaria. Memoirs of the American 
Academy of Arts and Sciences 9, 421-436 (1873). 

5. Metschnikoff, V. Über die systematische Stellung von Balanoglossus. Zool. Anz.
4, 139-157 (1881).

6. Halanych, K. M. The phylogenetic position of the pterobranch
hemichordates based on 18S rDNA sequence data. Mol. Phylogenet.
Evol. 4, 72-76 (1995).

7. Cannon, J. T. et al. Phylogenomic resolution of the hemichordate and
echinoderm clade. Curr. Biol. 24, 2827-2832 (2014).

8. Ogasawara, M., Wada, H., Peters, H. & Satoh, N. Developmental expression of
Pax1/9 genes in urochordate and hemichordate gills: insight into function
and evolution of the pharyngeal epithelium. Development 126, 2539-2550
(1999).

9. Gillis, J. A., Fritzenwanker, J. H. & Lowe, C. J. A stem-deuterostome origin of the
vertebrate pharyngeal transcriptional network. Proc. R. Soc. Lond. B 279,
237-246 (2012).

10. Lowe, C. J., Clarke, D. N., Medeiros, D. M., Rokhsar, D. S. & Gerhart, J.
The deuterostome context of chordate origins. Nature 520, 456-465
(2015).

11. Swalla, B. J. & Smith, A. B. Deciphering deuterostome phylogeny: molecular,
morphological and palaeontological perspectives. Phil. Trans. R. Soc. Lond. B
363, 1557-1568 (2008).

12. Cameron, C. B., Garey, J. R. & Swalla, B. J. Evolution of the chordate body plan:
new insights from phylogenetic analyses of deuterostome phyla. Proc. Natl
Acad. Sci. USA 97, 4469-4474 (2000).

13. Gerhart, J., Lowe, C. & Kirschner, M. Hemichordates and the origin of
chordates. Curr. Opin. Genet. Dev. 15, 461-467 (2005).

14. Gonzalez, P. & Cameron, C. B. The gill slits and pre-oral ciliary organ of
Protoglossus (Hemichordata: Enteropneusta) are filter-feeding structures.
Biol. J. Linn. Soc. 98, 898-906 (2009).

15. Brown, F. D., Prendergast, A. & Swalla, B. J. Man is but a worm: chordate
origins. Genesis 46, 605-613 (2008).

16. Holland, N. D., Holland, L. Z. & Holland, P. W. Scenarios for the making of
vertebrates. Nature 520, 450-455 (2015).

17. Hyman, L. H. The invertebrates: smaller coelomate groups chaetognatha,
hemichordata, pogonophora, phoronida, ectoprocta, brachiopoda, sipunculida,
the coelomate bilateria Vol. 5. (McGraw-Hill, 1959).

18. Bourlat, S. J. et al. Deuterostome phylogeny reveals monophyletic chordates
and the new phylum Xenoturbellida. Nature 444, 85-88 (2006).

19. Philippe, H. et al. Acoelomorph flatworms are deuterostomes related to
Xenoturbella. Nature 470, 255-258 (2011).

20. Ruiz-Trillo, I., Riutort, M., Fourcade, H. M., Baguñà, J. & Boore, J. L.
Mitochondrial genome data support the basal position of Acoelomorpha and
the polyphyly of the Platyhelminthes. Mol. Phylogenet. Evol. 33, 321-332
(2004).

21. Hejnol, A. et al. Assessing the root of bilaterian animals with scalable
phylogenomic methods. Proc. R. Soc. Lond. B 276, 4261-4270
(2009).

22. Edgecombe, G. D. et al. Higher-level metazoan relationships: recent progress
and remaining questions. Org. Divers. Evol. 11, 151-172 (2011).

23. Srivastava, M., Mazza-Curll, K. L., van Wolfswinkel, J. C. & Reddien, P. W.
Whole-body acoel regeneration is controlled by Wnt and Bmp-Admp signaling.
Curr. Biol. 24, 1107-1113 (2014).

24. Lartillot, N., Lepage, T. & Blanquart, S. PhyloBayes 3: a Bayesian software
package for phylogenetic reconstruction and molecular dating. Bioinformatics
25, 2286-2288 (2009).

25. Ulitsky, I., Shkumatava, A., Jan, C. H., Sive, H. & Bartel, D. P. Conserved function
of lincRNAs in vertebrate embryonic development despite rapid sequence
evolution. Cell 147, 1537-1550 (2011).

26. Royo, J. L. et al. Transphyletic conservation of developmental regulatory
state in animal evolution. Proc. Natl Acad. Sci. USA 108, 14186-14191
(2011).

27. Putnam, N. H. et al. The amphioxus genome and the evolution of the chordate
karyotype. Nature 453, 1064-1071 (2008).

28. Irimia, M. et al. Extensive conservation of ancient microsynteny across
metazoans due to cis-regulatory constraints. Genome Res. 22, 2356-2367
(2012).

29. Sodergren, E. et al. The genome of the sea urchin Strongylocentrotus purpuratus.
Science 314, 941-952 (2006).

30. Freeman, R. et al. Identical genomic organization of two hemichordate hox
clusters. Curr. Biol. 22, 2053-2058 (2012).


© 2015 Macmillan Publishers Limited. All rights reserved 


31. Ikuta, T. et al. Identification of an intact ParaHox cluster with temporal
colinearity but altered spatial colinearity in the hemichordate Ptychodera flava.
BMC Evol. Biol. 13, 129 (2013).

32. Cameron, R. A. et al. Unusual gene order and organization of the sea
urchin hox cluster. J. Exp. Zoolog. B Mol. Dev. Evol. 306, 45-58
(2006).

33. Baughman, K. W. et al. Genomic organization of Hox and ParaHox
clusters in the echinoderm, Acanthaster planci. Genesis 52, 952-958
(2014).

34. Santagati, F. et al. Identification of cis-regulatory elements in the mouse
Pax9/Nkx2-9 genomic region: implication for evolutionary conserved synteny.
Genetics 165, 235-242 (2003).

35. Lowe, C. J. et al. Dorsoventral patterning in hemichordates: insights into early
chordate evolution. PLoS Biol. 4, e291 (2006).

36. Wang, W., Zhong, J., Su, B., Zhou, Y. & Wang, Y. Q. Comparison of Pax1/9 locus
reveals 500-Myr-old syntenic block and evolutionary conserved noncoding
regions. Mol. Biol. Evol. 24, 784-791 (2007).

37. Santagati, F. et al. Comparative analysis of the genomic organization of Pax9
and its conserved physical association with Nkx2-9 in the human, mouse, and
pufferfish genomes. Mamm. Genome 12, 232-237 (2001).

38. Wang, S., Zhang, S., Zhao, B. & Lun, L. Up-regulation of C/EBP by thyroid
hormones: a case demonstrating the vertebrate-like thyroid hormone
signaling pathway in amphioxus. Mol. Cell. Endocrinol. 313, 57-63
(2009).

39. Lowe, C. J. et al. Anteroposterior patterning in hemichordates and the origins of
the chordate nervous system. Cell 113, 853-865 (2003).

40. Kokubu, C. et al. A transposon-based chromosomal engineering method to
survey a large cis-regulatory landscape in mice. Nature Genet. 41, 946-952
(2009).

41. Giacopuzzi, E., Bresciani, R., Schauer, R., Monti, E. & Borsani, G. New insights
on the sialidase protein family revealed by a phylogenetic analysis in metazoa.
PLoS ONE 7, e44193 (2012).

42. Harduin-Lepers, A., Mollicone, R., Delannoy, P. & Oriol, R. The animal
sialyltransferases and sialyltransferase-related genes: a phylogenetic approach.
Glycobiology 15, 805-817 (2005).

43. Harduin-Lepers, A. et al. Evolutionary history of the alpha2,8-sialyltransferase
(ST8Sia) gene family: tandem duplications in early deuterostomes explain
most of the diversity found in the vertebrate ST8Sia genes. BMC Evol. Biol. 8,
258 (2008).

44. Pardos, F. Fine structure and function of pharynx cilia in Glossobalanus minutus
Kowalewsky (Enteropneusta). Acta Zoologica 69, 1-12 (1988).

45. Kaul-Strehlow, S. & Stach, T. A detailed description of the development of the
hemichordate Saccoglossus kowalevskii using SEM, TEM, Histology and
3D-reconstructions. Front. Zool. 10, 53 (2013).

46. Ruppert, E. E., Cameron, C. B. & Frick, J. E. Endostyle-like features of the dorsal
epibranchial ridge of an enteropneust and the hypothesis of dorsal-ventral axis
inversion in chordates. Invertebr. Biol. 118, 202-212 (1999).

47. Range, R. & Lepage, T. Maternal Oct1/2 is required for Nodal and Vg1/Univin
expression during dorsal-ventral axis specification in the sea urchin embryo.
Dev. Biol. 357, 440-449 (2011).

48. Massagué, J. TGFβ signalling in context. Nature Rev. Mol. Cell Biol. 13, 616-630
(2012).




49. Range, R. et al. Cis-regulatory analysis of nodal and maternal control of
dorsal-ventral axis formation by Univin, a TGF-β related to Vg1. Development
134, 3649-3664 (2007).


Supplementary Information is available in the online version of the paper. 


Acknowledgements The Ptychodera flava genome project was supported by
MEXT and OIST, Japan. This research was supported by USPHS grant HD42724
and NASA grant FDNAG2-1605 to J.G.; USPHS grant HD37277 to M.W.K.;
NASA - NNX13AI68G to C.L. F.M. was funded by FP7/ERC grant [268513].
O.S. and D.S.R., and T.K. and N.S. were supported by the Molecular Genetics
Unit and Marine Genomics Unit of the Okinawa Institute of Science and
Technology Graduate University, respectively. Y.-H.S. and J.-K.Y. are supported
by Academia Sinica and Ministry of Science and Technology, Taiwan. L.P. was
supported by NIH grant R01HD073104. The Saccoglossus kowalevskii genome
project was supported by a grant from the National Human Genome Research
Institute, National Institutes of Health (U54 HG003273) to R.A.G.


Author Contributions J.Q., K.C.W. assembled the initial S. kowalevskii genomic
assembly and performed quality assessments of the genome assemblies.
Ptychodera collection, genome sequencing, and assembly: T.K., K.T., A.S., R.K.,
H.G., M.F., M.ILK., N.A., S.Y., A.F., T.H. Rhabdopleura collection: A.S. Saccoglossus
RNA sequencing and analysis: R.M.F., M.W.K., R.C., C.L.K., S.L.L., M.H., S.R.,
D.M.M., K.C.W. Genome sequence production: A.C., Y.D., H.H.D., S.D., M.H.,
S.N.J., C.L.K., S.L.L., L.R.L., D.M., L.V.N., G.O., J.Sa., S.R., K.C.W., D.M.M., L.P., BF,
M.W.K. Saccoglossus sequence finishing: S.D., Y.D., D.M.M. Final Saccoglossus
assembly: J.J., J.Sc. Saccoglossus gene modelling and validation: T.M., J.B.,
J.H.F., A.M.P., M.W. Ptychodera gene modelling and analyses: T.K., R.K., K.H., E.S.,
F.G., K.W.B., K.T., O.S., J.G., N.S. Gene family analyses: O.S., T.K., F.M., L.P., R.M.F.,
CLL, J.G. Synteny: N.H.P., O.S., J.-X.Y. Repeats: O.S. Saccoglossus sequencing
and assembly project management: S.R., D.M.M., K.C.W., R.A.G. Ptychodera
expression analysis: Y.-C.C., Y.-H.S., J.-K.Y. Phylogenetic analyses: F.M. Additional
EST collections: T.H.-K., K.T., A.S., A.T.S., J.P., P.G., C.C., C.L. HGT and novelties: J.G.,
O.S. Pharyngeal cluster analysis and expression: J.G., O.S., N.S., K.B., A.G. Project
coordination, manuscript writing: O.S., T.K., F.M., K.T., N.S., J.G., C.L., D.S.R.


Author Information Sequencing data have been deposited in NCBI BioProject
under accession number PRJNA12887 (Saccoglossus kowalevskii) and
DDBJ under accession number PRJDB3182 (Ptychodera flava). Reprints and
permissions information is available at www.nature.com/reprints. The authors
declare no competing financial interests. Readers are welcome to comment
on the online version of the paper. Correspondence and requests for materials
should be addressed to O.S. (oleg.simakov@oist.jp), J.G. (jgerhart@berkeley.edu),
N.S. (norisky@oist.jp) and D.S.R. (dsrokhsar@gmail.com).


This work is licensed under a Creative Commons Attribution-
NonCommercial-ShareAlike 3.0 Unported licence. The images or
other third party material in this article are included in the article’s Creative
Commons license, unless indicated otherwise in the credit line; if the material is
not included under the Creative Commons license, users will need to obtain
permission from the license holder to reproduce the material. To view a copy of
this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/




METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Sequencing. Sperm DNA from adult males was extracted for sequencing as 
described in Supplementary Note 2. A single male was used for each species to min- 
imize the impact of heterozygosity on assembly. For Saccoglossus, approximately 
eightfold redundant random shotgun coverage (totalling 8.1 Gb) was obtained with 
Sanger dideoxy sequencing at the Baylor College of Medicine Genome Center, 
including 34,279 BAC ends and 459,052 fosmid ends. For Ptychodera, 1.3 Gb in 
Sanger shotgun sequences, 15.3 Gb in Roche 454 pyrosequence reads, and 52-Gb 
paired-end sequences with Illumina MiSeq, along with mate-pairs, were generated 
at the Okinawa Institute of Science and Technology Graduate University. More 
sequencing details are available in Supplementary Note 2. 

Genome assemblies. We assembled the Saccoglossus genome with Arachne (ref. 50),
combined with BAC/fosmid pair information to produce the final assembly.
This Saccoglossus assembly includes 7,282 scaffold sequences spanning a
total length of 758 Mb. The relatively modest nucleotide heterozygosity (0.5%) of
S. kowalevskii, coupled with longer read lengths, enabled assembly of a single
composite reference sequence. Half of the assembly is in scaffolds longer than 552 kb
(the N50 scaffold length), and 82% of the assembled sequence is found in 1,602
scaffolds longer than 100 kb. For Ptychodera we used the Platanus assembler (ref. 51).
The resulting total scaffold length was 1,229 Mb, with half the assembly in
scaffolds longer than 196 kb (N50 scaffold length). P. flava exhibited a notably higher
heterozygosity (1.3% single nucleotide heterozygosity with frequent indels) than
S. kowalevskii, presumably related to its pelagic dispersal and larger effective
population size (ref. 52). We therefore initially produced stringent separate assemblies
of the two divergent haplotypes, and found that many scaffolds had a closely related
second scaffold with ~94% BLASTN identity (over longer stretches, including
indels). To avoid reporting both haplotypes at these loci, scaffolds with less than
6% divergence over at least 75% of their length were merged into a single haploid
reference for comparative analysis. To further classify regions with ‘double’ depth
and single haplotype regions we implemented a Hidden Markov Model
classifier. We find that at least 63% of the initial Platanus assembly constitutes merged
haplotypes. The inferred SNP rate for those regions is 1.3%, while for the remaining
haplotype regions it is below 0.1%. Further details of assemblies are described in
Supplementary Note 2.
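
The N50 scaffold length quoted above can be illustrated with a minimal sketch. It assumes only a list of scaffold lengths; the function names and the toy lengths are hypothetical and are not taken from the assemblies or from the authors' code.

```python
def n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together contain at least half of the assembled bases."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def fraction_in_long_scaffolds(lengths, min_length):
    """Fraction of assembled bases found in scaffolds longer than min_length."""
    return sum(l for l in lengths if l > min_length) / sum(lengths)


# Hypothetical scaffold lengths in kb, for illustration only.
toy_lengths = [800, 600, 400, 100, 60, 40]
print(n50(toy_lengths))
print(fraction_in_long_scaffolds(toy_lengths, 100))
```

The second helper mirrors the statement that a given share of the assembly lies in scaffolds longer than 100 kb.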

Gene predictions. Transcriptome data for both species were used, along with
homology-guided and ab initio methods, to predict protein-coding genes
(Supplementary Note 3). For Saccoglossus, 8.6 million RNAseq reads were
generated from 7 adult tissues and 15 developmental stages using Roche 454 sequencing,
along with previously deposited ESTs in GenBank. For Ptychodera, extensive EST
data from egg, blastulae, gastrulae, larvae, juveniles, adult proboscis, stomochord,
and gills defining 34,159 cDNA clones (ref. 53), and 879,000 Roche/454 RNAseq reads
from a mixed library of developmental stages (ref. 54) were used. The Saccoglossus genome
was annotated using the JGI gene prediction pipeline (ref. 55), while Augustus (ref. 56)
was used to produce gene models for Ptychodera. We find a total of 34,239 gene predictions for
Saccoglossus (68% with transcript evidence) and 34,687 for Ptychodera (43% with
transcript evidence), although these are overestimates of the true gene number due
to fragmented gene predictions, mis-annotated repetitive sequences, and spurious
predictions. As described in the main text, 18,000-19,000 gene models in each species
have known annotations and/or orthologues in other species.

Gene family analysis. Gene family clustering was done using a progressive (leaf 
to root) BLASTP-based clustering algorithm, where at a given phylogenetic node 
the gene families are constructed taking into account protein similarities among 
ingroups and outgroups (ref. 57). For the inference of deuterostome gene families we
use the bilaterian node of the clustering. To call gene families present in the
deuterostome ancestor, we required (1) at least two ambulacrarian orthologues out of
the three available ambulacrarian genomes and at least two chordate orthologues, 
or (2) at least two deuterostomes (chordates and/or ambulacrarians) and two 
outgroups in the bilaterian level clusters. 
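
The two-part presence rule above can be written as a small predicate. This is a sketch under stated assumptions: the species codes are borrowed from the abbreviations used in the Extended Data captions, the chordate code "Hsa" and the choice of outgroup species are hypothetical, and "two outgroups" is read as "at least two".

```python
# Illustrative species sets; not the paper's full species list.
AMBULACRARIANS = frozenset({"Sko", "Pfl", "Spu"})
CHORDATES = frozenset({"Bfl", "Cin", "Hsa"})          # "Hsa" is a hypothetical code
OUTGROUPS = frozenset({"Cgi", "Lgi", "Cte", "Hro"})   # assumed protostome outgroups


def in_deuterostome_ancestor(species_with_orthologue):
    """Apply the two alternative criteria to the set of species that
    contribute a member to one bilaterian-level gene family cluster."""
    present = set(species_with_orthologue)
    n_amb = len(present & AMBULACRARIANS)
    n_cho = len(present & CHORDATES)
    n_out = len(present & OUTGROUPS)
    # (1) at least two ambulacrarians and at least two chordates
    rule_1 = n_amb >= 2 and n_cho >= 2
    # (2) at least two deuterostomes of any kind plus at least two outgroups
    rule_2 = (n_amb + n_cho) >= 2 and n_out >= 2
    return rule_1 or rule_2


print(in_deuterostome_ancestor({"Sko", "Pfl", "Bfl", "Cin"}))
print(in_deuterostome_ancestor({"Sko", "Bfl", "Cgi"}))
```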

Transposable elements. Repetitive sequences were identified using RepeatScout (ref. 58)
followed by manual curation and annotation using both a Repbase release (version
20140131) and BLASTX-based search against a custom collection of
transposons, using a previously described repeat identification and annotation pipeline (ref. 57)
(Supplementary Note 5). The assemblies were then masked with RepeatMasker
version open-4.0.5 (ref. 60). The repetitive complements of the two hemichordate
genomes are summarized in Supplementary Table 5.1.

Phylogenetic analysis. Phylogenetic analyses were done using metazoan-level
gene family clusters based on whole-genome sequences (Supplementary Note 4),
selecting a single orthologue per genome with the best cumulative BLASTP to
other species, and best reciprocal BLASTP hits to species with transcriptome-only
information (Supplementary Note 6). Single gene alignments were built using
Muscle (ref. 61) and filtered using Trimal (ref. 62) for each orthologue, and were
concatenated, yielding a supermatrix of 506,428 positions with 34.9% missing data. This
supermatrix was analysed with ExaML assuming a site-homogeneous LG+Γ4 model
partitioned for each gene (ref. 63). A slow-fast analysis was conducted to stratify marker
genes based on the length of the branch leading to acoels in individual trees.
A subset of the slowest 10% of genes was analysed with the site-heterogeneous
CAT+GTR+Γ4 model using PhyloBayes (ref. 24). Molecular dating was carried out
using PhyloBayes (ref. 24) using the log-normal relaxed clock model and the calibrations
described in Supplementary Table 6.2.
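
The concatenation and slow-fast steps described above can be sketched as follows. This is an illustration, not the authors' pipeline: the function names are hypothetical, missing data is assumed to mean alignment gaps plus padding for taxa absent from a gene, and the slow-fast step is reduced to ranking genes by a supplied branch length.

```python
def concatenate(alignments, taxa, missing="?"):
    """Concatenate per-gene alignments (dicts of taxon -> aligned sequence)
    into a supermatrix, padding taxa that lack a gene with `missing`."""
    columns = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        length = len(next(iter(alignment.values())))
        for taxon in taxa:
            columns[taxon].append(alignment.get(taxon, missing * length))
    return {taxon: "".join(parts) for taxon, parts in columns.items()}


def missing_fraction(supermatrix, missing_chars="?-"):
    """Fraction of supermatrix cells that are gaps or padding."""
    total = sum(len(seq) for seq in supermatrix.values())
    absent = sum(seq.count(c) for seq in supermatrix.values() for c in missing_chars)
    return absent / total


def slowest_genes(branch_lengths, fraction=0.10):
    """Return the gene names with the shortest branch lengths
    (here: the branch leading to the taxon of interest in each gene tree)."""
    ranked = sorted(branch_lengths, key=branch_lengths.get)
    return ranked[:max(1, int(len(ranked) * fraction))]


# Toy data, for illustration only.
genes = [{"A": "MK", "B": "M-"}, {"A": "LLL"}]
matrix = concatenate(genes, ["A", "B"])
print(matrix)
print(missing_fraction(matrix))
```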

Synteny analysis. Macro- and micro-syntenic linkages were calculated as described 
in Supplementary Note 7. For Fig. 3a, we merged the amphioxus scaffolds into 17 
pre-defined scaffold groups as suggested in ref. 27. These 17 merged scaffold groups 
represent the 17 ancestral linkage groups (ALGs) shared in chordates. Then we cal- 
culated the orthologous gene groups shared by each amphioxus ALG-Saccoglossus 
scaffold pair and generated the dot plot as described in Supplementary Note 7. For 
micro-synteny we required at least three genes (separated by a maximum of ten 
genes) to be present in pairwise comparisons. Under random reshuffling of the 
genome, this yields 10% false positives in pairwise genome comparisons, that is, 
we observe approximately one-tenth as many micro-syntenic blocks between the 
two genomes when gene orders are shuffled. This false-positive rate, however, falls 
to 1% when considering more than two species. For our inference of deuterostome 
ancestral and novel synteny we therefore focus on blocks present in at least three 
species (and both ingroup representatives, that is, ambulacrarians and chordates). 
This yields 698 blocks that can be traced back to the deuterostome ancestor, includ- 
ing 71 blocks found exclusively in deuterostome species (shared among ambu- 
lacrarians and chordates), including the pharyngeal cluster discussed in Fig. 4. 
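
The pairwise micro-synteny criterion (at least three genes, separated by at most ten intervening genes) and the shuffling control can be sketched as below. This is a simplified illustration rather than the method of Supplementary Note 7: each scaffold is assumed to be a list of gene-family identifiers in chromosomal order, each family is assumed to occur at most once per scaffold, and gene order and orientation within a block are ignored. All names are hypothetical.

```python
import random


def runs(indices, max_intervening=10):
    """Split gene indices into runs in which neighbouring members are
    separated by at most `max_intervening` other genes."""
    out, current = [], []
    for i in sorted(indices):
        if current and i - current[-1] - 1 > max_intervening:
            out.append(current)
            current = []
        current.append(i)
    if current:
        out.append(current)
    return out


def microsyntenic_blocks(order_a, order_b, min_genes=3, max_intervening=10):
    """Return groups of shared families that are tightly linked on both scaffolds."""
    pos_a = {fam: i for i, fam in enumerate(order_a)}
    pos_b = {fam: i for i, fam in enumerate(order_b)}
    groups = [set(pos_a) & set(pos_b)]
    changed = True
    while changed:  # alternate splitting by A and by B until nothing splits
        changed = False
        for pos in (pos_a, pos_b):
            refined = []
            for group in groups:
                by_index = {pos[fam]: fam for fam in group}
                parts = runs(by_index, max_intervening)
                changed = changed or len(parts) > 1
                refined.extend({by_index[i] for i in part} for part in parts)
            groups = refined
    return [sorted(g) for g in groups if len(g) >= min_genes]


def shuffled_block_count(order_a, order_b, seed=0, **kwargs):
    """Count blocks after randomly reshuffling the gene order of one scaffold."""
    shuffled = list(order_b)
    random.Random(seed).shuffle(shuffled)
    return len(microsyntenic_blocks(order_a, shuffled, **kwargs))


a = ["x1", "f1", "x2", "f2", "f3", "x3"]
b = ["f3", "y1", "f1", "f2"]
print(microsyntenic_blocks(a, b))
```

Comparing the observed block count with counts from repeated shuffles gives an empirical false-positive estimate of the kind quoted in the text.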

Whole-genome alignment. Whole-genome alignments were conducted with
MEGABLAST (ref. 64) using parameters previously reported (ref. 65). We assessed the
distribution of the resulting 12,722 aligned loci across known gene annotations in
ENSEMBL (ref. 66), previously identified conserved pan-vertebrate elements (ref. 65), as well
as known enhancers in human according to the LBL database (ref. 67).

Gene novelties. Deuterostome gene novelties were assessed initially through bila- 
terian gene clusters (Supplementary Note 10) by requiring at least two species on 
both ambulacrarian and chordate side to be present. The novelties were further 
automatically subdivided into four categories: G1 (gain type I), with no BLASTP 
hit outside of deuterostomes; G2 (gain type II), with a novel PFAM domain pres- 
ent only in deuterostomes; G3 (gain type III) having a novel PFAM combination 
unique to deuterostomes; and G4 (gain type IV), those that do not fall under any 
of the G1-3 categories and define novelties due to acceleration in the substitution 
rate on the deuterostome stem. To confirm the novel nature, especially for G4 
novelties, we have constructed phylogenies for the members and non-deuterostome 
BLASTP hits (up to an e-value of 1 x 10~?°) using MAFFT-alignment-based 
FastTree calculations. The trees were assessed for the accelerated rate of evolution 
at the deuterostome stem (Supplementary Fig. 9.1.1). The final result is provided 
in the Supplementary Information. 
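
The four-way subdivision of novelties can be sketched as a simple decision function. The order in which the categories are tested, the use of sets of Pfam identifiers, and the representation of a domain combination as a tuple are assumptions made for illustration; the text does not specify these details.

```python
def classify_novelty(outside_hits, family_domains, family_architectures,
                     outside_domains, outside_architectures):
    """Assign a deuterostome gene family to one of the categories G1-G4.

    outside_hits          -- BLASTP hits outside deuterostomes (any sequence type)
    family_domains        -- Pfam domains found in the family
    family_architectures  -- domain combinations (tuples) found in the family
    outside_domains       -- Pfam domains seen outside deuterostomes
    outside_architectures -- domain combinations seen outside deuterostomes
    """
    if not outside_hits:
        return "G1"  # no BLASTP hit outside of deuterostomes
    if set(family_domains) - set(outside_domains):
        return "G2"  # a domain present only in deuterostomes
    if set(family_architectures) - set(outside_architectures):
        return "G3"  # a domain combination unique to deuterostomes
    return "G4"      # none of the above: candidate for accelerated evolution


# Hypothetical identifiers, for illustration only.
print(classify_novelty(["hit1"], {"PF_a", "PF_b"}, {("PF_a", "PF_b")},
                       {"PF_a", "PF_b"}, {("PF_a",), ("PF_b",)}))
```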

Curation of candidates for horizontal gene transfer on the deuterostome stem. 
We examined in detail gene families found broadly in deuterostomes whose 
encoded peptides were readily alignable to microbial sequences but had no detect- 
able similarity in non-deuterostome animals. Criteria for evaluation included: 
(1) the hemichordate gene matches microbial genes at least ten orders of magnitude 
in the e-value better than it matches sequences of non-deuterostome metazoans 
(most of the putative HGTs we describe have no non-deuterostome metazoan hit 
at all); (2) it has a defined genomic locus among bona fide metazoan genes; (3) it 
shares an exon-intron structure with genes of chordates and other ambulacraria; 
and (4) when a low bitscore match is found to a non-deuterostome metazoan 
sequence, that sequence is identified as containing different domains (domain 
structure according to CDD; ref. 68) and/or different exon-intron structure, implying
dubious relatedness. When phylogenetic trees are constructed for these HGT- 
candidate proteins, the trees contain numerous branches for microbial sequences 
and none for non-deuterostome metazoan sequences, or only very long branches 
for dubious relatives, and hence the trees differ greatly from the metazoan species
tree, except within the deuterostome clade. 
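
Criterion (1) above, the ten-orders-of-magnitude e-value margin, can be expressed as a short check. This sketch covers only that criterion; the function name, the handling of a missing metazoan hit, and the floor used to avoid taking the logarithm of zero are assumptions.

```python
import math


def passes_evalue_criterion(best_microbial_evalue,
                            best_nondeuterostome_metazoan_evalue=None,
                            min_orders=10):
    """True if the best microbial hit is at least `min_orders` orders of
    magnitude better (smaller e-value) than the best hit to a
    non-deuterostome metazoan, or if no such metazoan hit exists."""
    if best_nondeuterostome_metazoan_evalue is None:
        return True
    floor = 1e-300  # BLAST can report 0.0; clamp so log10 stays defined
    microbial = max(best_microbial_evalue, floor)
    metazoan = max(best_nondeuterostome_metazoan_evalue, floor)
    return math.log10(metazoan) - math.log10(microbial) >= min_orders


print(passes_evalue_criterion(1e-50, 1e-30))
print(passes_evalue_criterion(1e-50, 1e-45))
```

Criteria (2) to (4) concern genomic context, exon-intron structure and domain content, and would need annotation data beyond what this check uses.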

Code availability. Original data and code can be accessed at
https://groups.oist.jp/molgenu.


50. Jaffe, D. B. et al. Whole-genome sequence assembly for mammalian genomes: 
Arachne 2. Genome Res. 13, 91-96 (2003). 

51. Kajitani, R. et al. Efficient de novo assembly of highly heterozygous genomes 
from whole-genome shotgun short reads. Genome Res. 24, 1384-1395 
(2014). 

52. Romiguier, J. et al. Comparative population genomics in animals uncovers the 
determinants of genetic diversity. Nature 515, 261-263 (2014). 




53. Tagawa, K. et al. A cDNA resource for gene expression studies of a
hemichordate, Ptychodera flava. Zoolog. Sci. 31, 414-420 (2014).

54. Chen, S. H. et al. Sequencing and analysis of the transcriptome of the acorn
worm Ptychodera flava, an indirect developing hemichordate. Mar. Genomics
15, 35-43 (2014).

55. Salamov, A. A. & Solovyev, V. V. Ab initio gene finding in Drosophila genomic
DNA. Genome Res. 10, 516-522 (2000).

56. Stanke, M. & Waack, S. Gene prediction with a hidden Markov model
and a new intron submodel. Bioinformatics 19 (Suppl. 2), ii215-ii225
(2003).

57. Simakov, O. et al. Insights into bilaterian evolution from three spiralian
genomes. Nature (2013).

58. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families
in large genomes. Bioinformatics 21 (Suppl. 1), i351-i358 (2005).

59. Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements.
Cytogenet. Genome Res. 110, 462-467 (2005).

60. Smit, A., Hubley, R. & Green, P. RepeatMasker http://www.repeatmasker.org
(2007).

61. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced
time and space complexity. BMC Bioinformatics 5, 113 (2004).


62. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for
automated alignment trimming in large-scale phylogenetic analyses.
Bioinformatics 25, 1972-1973 (2009).

63. Aberer, A. & Stamatakis, A. ExaML: Exascale maximum likelihood: program and
documentation. See http://sco.h-its.org/exelixis/web/software/examl/index.html
(2013).

64. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein
database search programs. Nucleic Acids Res. 25, 3389-3402 (1997).

65. Lee, A. P., Kerk, S. Y., Tan, Y. Y., Brenner, S. & Venkatesh, B. Ancient vertebrate
conserved noncoding elements have been evolving rapidly in teleost fishes.
Mol. Biol. Evol. 28, 1205-1215 (2011).

66. Cunningham, F. et al. Ensembl 2015. Nucleic Acids Res. 43, D662-D669 (2015).

67. Visel, A., Minovitsky, S., Dubchak, I. & Pennacchio, L. A. VISTA Enhancer
Browser-a database of tissue-specific human enhancers. Nucleic Acids Res.
35, D88-D92 (2007).

68. Marchler-Bauer, A. et al. CDD: NCBI’s conserved domain database. Nucleic
Acids Res. 43, D222-D226 (2015).

69. Marinić, M., Aktas, T., Ruf, S. & Spitz, F. An integrated holo-enhancer unit
defines tissue and gene specificity of the Fgf8 regulatory landscape. Dev. Cell
24, 530-542 (2013).




a
                              Saccoglossus   Ptychodera
Scaffold total                7,282          218,255
Contig total                  20,913         322,077
Scaffold sequence total, Mb   758            1,229
Scaffold N50, kb              552            196
Gene models                   34,239         34,687
SNP rate, %                   0.5            1.3 (2×), 0.06 (1×)

[Panels b, c: distributions of SNPs per 100-bp window (x axis: SNPs in 100 bp, 0 to 30), each with Poisson and geometric fits.]

Extended Data Figure 1 | Summary of genome assemblies and
heterozygosity distributions for Saccoglossus and Ptychodera.
a, Genome statistics summary. b, c, The single nucleotide polymorphism
distribution across 100-bp windows for Saccoglossus (b) and the
corresponding distribution for Ptychodera (c). The distributions in b
and c are fitted with a geometric (expected when high recombination rate
is present) and a Poisson distribution (expected with low recombination
rate). The distribution for Saccoglossus is fitted to windows with
one or more SNPs only, as there is an excess of zero SNP windows
(approximately 84% of total 94,324 selected windows). For methods refer
to Supplementary Note 2.
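
A comparison of Poisson and geometric fits to per-window SNP counts, as in panels b and c, can be sketched with maximum-likelihood estimates and log-likelihoods. This is an illustration under stated assumptions, not the procedure of Supplementary Note 2: the geometric distribution is taken on 0, 1, 2, ..., both fits use untruncated counts (so the restriction to windows with one or more SNPs is not modelled), and the counts are toy values.

```python
import math


def poisson_loglik(counts):
    """Log-likelihood of the counts under a Poisson with its MLE rate (the mean)."""
    lam = sum(counts) / len(counts)
    if lam <= 0:
        raise ValueError("mean count must be positive")
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def geometric_loglik(counts):
    """Log-likelihood under a geometric on 0, 1, 2, ... with P(k) = (1-p)^k p,
    using the MLE p = 1 / (1 + mean)."""
    mean = sum(counts) / len(counts)
    if mean <= 0:
        raise ValueError("mean count must be positive")
    p = 1.0 / (1.0 + mean)
    return sum(k * math.log(1.0 - p) + math.log(p) for k in counts)


def better_fit(counts):
    """Name of the model with the higher log-likelihood."""
    return "geometric" if geometric_loglik(counts) > poisson_loglik(counts) else "poisson"


# Toy SNP counts per window, for illustration only.
print(better_fit([0, 0, 0, 1, 0, 5, 0, 0, 9, 0]))  # overdispersed counts
print(better_fit([1, 2, 1, 2, 1, 2, 1, 2, 2, 1]))  # evenly spread counts
```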




[Panel a: principal component plot, axes PC1 (12%) and PC2 (7%). Panel b: heat map of gene family counts, colour scale row Z-score.]

Extended Data Figure 2 | Ambulacrarians approximate the ancestral
metazoan gene repertoire. a, Principal component analysis of
Panther gene family sizes. Variances of the first two components are
plotted in parentheses. Blue indicates deuterostomes; green indicates
lophotrochozoans; red, ecdysozoans; yellow/orange, non-bilaterian
metazoans. Note the clustering of the ambulacrarians Sko, Pfl and
Spu with the non-vertebrate deuterostomes Bfl and Cin in the lower
right corner, also with the lophotrochozoans Cgi, Lgi, Hro, Cte and the
non-bilaterians Hma, Nve and Adi. b, Heat map of gene family counts
showing significant (Fisher’s exact test P value <0.01 after Bonferroni
multiple testing correction) expansion in ambulacrarians as well as in
Saccoglossus/Ptychodera/amphioxus. The cases discussed in the main
text are highlighted in red. See Supplementary Note 4 for details. Species
abbreviations are defined in Supplementary Note 4.1.




[Figure: time-calibrated tree with node ages in Ma. Tip taxa, in order: Pinctada fucata, Crassostrea gigas, Lottia gigantea, Chaetopleura apiculata, Tubulanus polymorphus, Cephalothrix linearis, Capitella teleta, Alvinella pompejana, Helobdella robusta, Stylochoplana maculata, Tribolium castaneum, Drosophila melanogaster, Daphnia pulex, Strigamia maritima, Romanomermis culicivorax, Priapulus caudatus, Mus musculus, Homo sapiens, Gallus gallus, Xenopus tropicalis, Latimeria chalumnae, Petromyzon marinus, Eptatretus burgeri, Halocynthia roretzi, Botryllus schlosseri, Molgula tectiformis, Ciona intestinalis, Branchiostoma floridae, Asymmetron lucayanum, Strongylocentrotus purpuratus, Scaphechinus mirabilis, Parastichopus parvimensis, Patiria miniata, Amphipholis sp., Florometra serratissima, Ptychodera flava, Balanoglossus clavigerus, Schizocardium californicum, Saccoglossus kowalevskii, Rhabdopleura compacta, Praesagittifera naikaiensis, Hofstenia miamia, Porites astreoides, Acropora digitifera, Montastraea faveolata, Nematostella vectensis, Hydra magnipapillata, Aurelia aurita, Trichoplax adhaerens, Ephydatia muelleri, Amphimedon queenslandica, Oscarella carmela.]

Extended Data Figure 3 | Molecular dating of deuterostome and
metazoan radiations using PhyloBayes assuming a log-normal relaxed
clock model. Yellow circles on particular nodes indicate the calibration
dates applied from the fossil record, as indicated in Supplementary Note
6.2. Bars are 95% credibility intervals derived from posterior distributions.
Note the estimated times of divergence of chordates and ambulacraria
(the deuterostome ancestor) at 570 million years ago (Ma;
mid-Ediacaran), hemichordates and echinoderms at 559 Ma, enteropneusts and
pterobranchs at 547 Ma, and Harrimaniid and Ptychoderid enteropneusts
at 373 Ma.




Pax258-like HSE 
SINE/TALE/CUT 


Ahbx-like HSE 


Lbx/ZF/POU 


5 


7 


Skowralewsit 


Skewvalevsil 


Skowalevsil 


Skowralewsii 


Skowalevsit 


Pf 


Skowalevsit 


Ex Hox 1-1a/b/e 


| —____ 
4p —$§ sca 0922_cov96 


4 pp $$ scoffs 158 


Wont HSEB 
I$ sald 612 
Sokowy20028813m 
Sakonw30038814m 


Ro Minka 
a cold_104 


Sakowwy30006172m 
Sakoerv30006178m 


ParaHox 


pil 40a 9 20150318 195425 
pl A0v0_9_ 20150316, 165430 
pil 400 9 20180516, 15432 


—$f— scaffold2509_cov 138 


Gs Xlae Cx 


Skonalenkii — ->——————— saffole_597 


DI Neds 
et cae_278 
‘Aug g8362511 

Sokoww 30027807 


pil 4tv0_9 20150316 193471 
pil 4ov0_9 20150316. 193476 
scaffold 463_covl53 


Barts Bart 
Skowalewski 4 scaffold_ 70 


Saou 30014246, 
Sokow3C014245m 


pfl 4ove_9 20150316 1413647 
ll 4ov0_9 20150316 1913652 


Pilar) | ——____¢___¢________scafold8190_cov 151 


Nea] Mo 
bce scald 13 
Soko 30044156m 
Sakny3001425%m 
Sokoy30044194m 


pill A009 20150316. 198 
pil 40v0_9 20180316 to111 
pil 400-9 20150316. 1g113 
pf 4ovo_9 20150316 19115 
—scffolis covas 
Heth} Wed2 Ux 
No 
Nive hex 


+444 > scaffold 8 
Sekowr/300s0087m 


Sokewy30040081m 
Sokew30040082m, 
Sakowr 20040088 
Sakowy 30040084 


pil 40v0_9. 20180216 110589 
pil 40v0_9 20150316 1910698 
ll Aov0_9 20180316. 1510703 


r Olp Rar 


sola ff p+» seo 7 
Sow 021875m 


Extended Data Figure 4 | Homeobox gene complement of the two
hemichordates in comparison to that of amphioxus. The numbers of
homeobox-containing gene models are 170 in Saccoglossus and 139 in
Ptychodera. These homeobox domains were aligned with 128 homeobox
genes of Branchiostoma floridae using ClustalW2, then gaps and unaligned
regions were manually removed. Since some genes have more than
one homeobox domain, we kept all domains or chose the longest one
according to the state of domain conservation. In total, 448 homeobox
sequences were aligned. See Supplementary Information for details.
The clusters of homeobox genes on scaffolds in Saccoglossus and
Ptychodera were identified and drawn at positions around the tree.
Conserved clusters between the two species were aligned. In addition
to the well-known Hox and ParaHox clusters, 17 clusters were found in at
least one of the hemichordates or some in both. Sixteen genes of the Nkx
class are distributed over four clusters: (i) nkx1a-vent1-vent2.1-vent2.2;
(ii) nkx2.1-nkx2.2-msxlx; (iii) nkx5-msx-nkx3.2-nkx4-lbx-hex; and
(iv) voxvent-nk7like-nk7like2. The second cluster (ii) of these is part of the
pharyngeal cluster (Fig. 4). Another five-gene cluster consists of one Lim
class homeobox gene and four PRD class homeobox genes: isl-otp-rax-arx-gsc.
A cluster of six3/6-six1/2-six4/5 was found in both species,
and a cluster of three unx genes was found only in P. flava. Ten more
clusters were found containing two homeobox genes each. Notably,
we found species-specific homeobox clusters in both species. Three
remarkable clusters were found in S. kowalevskii in which 10, 12 and
5 homeobox-containing genes are tandemly duplicated in scaffold_1710, _52
and _4796, respectively. We also found such clusters in P. flava in which
7, 4, 8 and 10 genes are aligned on scaffold 19451, scaffold 1398, scaffold
12422 and scaffold 154657, respectively. All homeobox genes identified
in the genomes of the two hemichordates and amphioxus are listed in the
Supplementary Table for Extended Data Fig. 4. This list includes some
genes not containing a homeobox (for example, pax1/9) in cases where
other family members do (for example, pax2).


© 2015 Macmillan Publishers Limited. All rights reserved 


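[Circos plots: observed (left) and simulated (right) micro-syntenic linkages among amphioxus, Saccoglossus, Ptychodera, Xenopus, Lottia and sea urchin. Legend: links shared by all species, 5 species, 4 species, 3 species, 2 species, or random.]
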
Extended Data Figure 5 | High retention rate of micro-synteny in
Saccoglossus. Circos plot showing micro-syntenic conservation in blocks
of genes (nmax = 10 and nmin = 2) for six metazoan species for observed
(left) and simulated (right) linkages. The width of connecting segments is
proportional to the number of genes participating in the syntenic linkages
(normalized by the total gene count). In this representation scaffolds are
placed end-to-end, and adjacent scaffolds need not be from the same
chromosome. While simulated data yields some blocks shared between
pairs of species, few or no synteny blocks can be recovered among three
or more species (Methods). Saccoglossus shows one of the highest
retentions among the selected species (and the highest among the
sequenced ambulacrarians). Xenopus (and vertebrates in general) have lost
some micro-synteny due to whole-genome duplications and differential
loss of paralogues. The matching between the hemichordate S. kowalevskii
and the chordate amphioxus is highest, consistent with the fact that neither
genome has undergone extensive gene loss (as have tunicates) or
pseudo-tetraploidization with extensive loss of paralogues (as have vertebrates).
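
The block definition in this legend (a maximum number of intervening genes and a minimum block size) can be illustrated with a toy sketch. This is not the procedure from the Methods: it assumes that each gene has already been assigned a gene-family label, that a single scaffold per species is compared, and that two shared genes are linked when at most `n_max` genes intervene between them on both scaffolds. The function name `synteny_blocks` and the toy gene orders are hypothetical.

```python
from itertools import combinations


def synteny_blocks(scaffold_a, scaffold_b, n_max=10, n_min=2):
    """Return blocks of gene families shared between two scaffolds.

    scaffold_a, scaffold_b: lists of gene-family labels in genomic order.
    Two shared genes are linked when at most n_max genes intervene between
    them on both scaffolds; a block is a linked group of >= n_min families.
    """
    # Every (position in a, position in b) pair carrying the same family.
    anchors = [(i, j)
               for i, fam_a in enumerate(scaffold_a)
               for j, fam_b in enumerate(scaffold_b)
               if fam_a == fam_b]

    # Union-find over anchors to collect linked groups.
    parent = list(range(len(anchors)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(range(len(anchors)), 2):
        (i1, j1), (i2, j2) = anchors[p], anchors[q]
        # k intervening genes corresponds to a position difference of k + 1.
        close_a = 0 < abs(i1 - i2) <= n_max + 1
        close_b = 0 < abs(j1 - j2) <= n_max + 1
        if close_a and close_b:
            parent[find(p)] = find(q)

    groups = {}
    for idx, (i, _) in enumerate(anchors):
        groups.setdefault(find(idx), set()).add(scaffold_a[i])
    return sorted(sorted(g) for g in groups.values() if len(g) >= n_min)


# Toy gene orders (hypothetical, not taken from the assemblies).
species_a = ["six3", "x1", "six1", "x2", "six4", "y"]
species_b = ["six4", "z", "six1", "six3"]
print(synteny_blocks(species_a, species_b))           # [['six1', 'six3', 'six4']]
print(synteny_blocks(species_a, species_b, n_max=0))  # []
```

Gene order and orientation within a block are ignored here, which matches the idea of micro-synteny as conserved linkage rather than conserved collinearity.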


[Figure panels: scaffold positions and gene-family legends]

a, ParaHox cluster (Gsx, Lox, Cdx, neighbor of ParaHox, VEGFR)
Hsa chr_13, 27250-27500 kb
Sko scaffold_597, 0-250 kb
Pfl scaffold_2509, 747-889 kb

b, Bmp2/4 - Univin
Sko scaffold_96, 938-950 kb
Pfl scaffold_11057, 49-63 kb
Bfl scaffold_347, 640-655 kb
Spu NW001346260, 356-416 kb

c
Xtr scaffold_719, 129-435 kb
Sko scaffold_542, 49-318 kb
Pfl scaffold_1080, 1198-1255 kb
Bfl scaffold_52, 1274-1662 kb
Gene families:
PTHR11848:SF7 Lefty
PTHR10050:SF11 Protein O-Mannosyl-Transferase
PTHR22973:SF6 Golgi resident protein
PTHR21661:SF2 Epoxide Hydrolase 1
PTHR12786:SF1 Uncharacterized Splicing Factor
PTHR11260:SF59 Glutathione S-Transferase
PTHR13018:SF9 Uncharacterized Transmembrane Protein
PTHR10662:SF18 Nuclear RNA Export Factor 1

d
Sko scaffold_71, 237-631 kb
Pfl scaffold_765, 598-978 kb
Bfl scaffold_52, 2771-2910 kb
Gene families:
PTHR19282:SF69 Tetraspanin-like
PTHR10390:SF13 Homeobox Protein Six1
PTHR10390:SF16 Homeobox Protein Six4
PTHR10342:SF28 Arylsulfatase

e
Sko scaffold_91, 363-684 kb
Pfl scaffold_800, 343-689 kb
Dre 13, 28495-28736 kb
Hsa 10, 101800-101600 kb
Gene families:
PTHR11486:SF2 Fibroblast Growth Factor 8
PTHR14381:SFO F-box/WD repeat domain containing protein
PTHR19316:SF1 Nucleotide exchange factor SIL1
PTHR24082:SF13 Peroxisome Proliferator-activated Receptor Alpha
PTHR23055:SF55 KCNIP K channel-interacting protein 1
PTHR22747 Nucleoplasmin

Extended Data Figure 6 | Deuterostome specific micro-syntenic
linkages. a, b, Very tight linkages with no intervening genes. a, ParaHox
cluster shown in S. kowalevskii, P. flava, and human. b, bmp2/4 and
univin cluster in the hemichordates S. kowalevskii and P. flava, the
sea urchin S. purpuratus, and the cephalochordate B. floridae.
c-e, Loose micro-syntenic linkages with a maximum of five intervening
genes: lefty (c), six1-six4 (d), and fgf8-fbxw (e) clusters. For c to e all
species with micro-synteny are shown. Numbers above the genes indicate
the copy number in the locus.


[Domain-structure diagrams: query sequence, specific and non-specific hits, and superfamilies for each protein. Sequences shown: a, mouse, Saccoglossus, Bombus impatiens and Ostreococcus tauri; b, human, Saccoglossus, Nematostella vectensis and Stanieria cyanosphaera; c, human, Saccoglossus, Lottia gigantea and Ostreococcus tauri.]

Extended Data Figure 7 | Three examples showing the domain structures
of some proteins encoded by genes found in deuterostomes and marine
microbes but not non-deuterostome animals. Best BLASTP hits of the
Saccoglossus sequence in human/mouse, as well as in non-deuterostome
metazoans and in non-metazoans (such as the cyanobacterium Stanieria
cyanosphaera, or the eukaryotic micro-alga Ostreococcus tauri) are shown.
a, Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMAH),
an enzyme of sialic acid modification; b, peptidyl arginyl deiminase (PAD),
an enzyme of post-translational modification of proteins; c, FATSO-like,
also called α-ketoglutarate-dependent dioxygenase FTO, an enzyme that
de-methylates N6-methyladenosine in nuclear RNA. Other analyses of these
and other genes with the unusual phylogenetic distributions can be found in
Supplementary Note 10.


[In situ hybridization panels. a, Saccoglossus. b, Ptychodera at the blastula, early gastrula, late gastrula, tornaria and juvenile stages; rows: VWD 1, VWD 3, VWD 4.]

Extended Data Figure 8 | In situ hybridization demonstration of the
expression of von Willebrand type D (vWD) domain-encoding genes
(putative glycoproteins/mucins) in Saccoglossus and Ptychodera.
a, In Saccoglossus the genes are specifically expressed in different
subregions of the ectoderm of the proboscis or collar at these pre-feeding
stages. b, In Ptychodera, several of the genes are expressed in endoderm
as well as ectoderm of the developing tornaria larva. The sequence IDs for
the genes are provided in Supplementary Note S10.4.


[Figure panels. a, Phylogenetic tree of TGFβ family members with clades labelled Lefty, TGFβ2, GDF8/11 and NODAL. b, Expression level across Saccoglossus stages (64 cell stage, small cell blastula, early gastrula, torpedo stage, anterior collar groove, dorsal kinking) for Lefty, TGFRII and TGF2. c, In situ images, including AAADC-like1 and AAADC-like2. d, Expression level across the same Saccoglossus stages.]



Extended Data Figure 9 | Gene innovation in deuterostomes. a, FastTree
phylogenetic tree of the TGFβ family members Lefty, TGFβ2, GDF8/11
and Nodal ligands (using GTR model). Bootstrap support is plotted as
filled circles (size proportional to the support value) on each node. While
Lefty shows deuterostome-unique sequence composition, TGFβ2 has
an acceleration of sequence change at the deuterostome stem branch,
compared to the GDF8/11 or Nodal groups. b, Temporal co-expression
of Lefty and TGFβ receptor type II in Saccoglossus at pre-gastrulation
developmental stages and of TGFβ2 and TGFβ receptor type I at
post-gastrulation stages. c, In situ hybridization demonstration of the
expression in S. kowalevskii of one of the putative type I novelty genes
(c9orf9, also known as rsb66) and of two of the AAADC genes (aromatic
amino acid decarboxylases of the microbial type) of S. kowalevskii (also in
P. flava and B. floridae), which closely resemble sequences from bacteria
rather than from non-deuterostome metazoans. gs, gill slits. d, The
temporal expression profile for c9orf9 during S. kowalevskii development,
taken from transcriptome data.




Extended Data Table 1 | Examples of deuterostome gene novelties and their genomic features

ID # | gene family name | BLAST e-value in non-deuterostome metazoa? (or “none”) | BLAST e-value in non-metazoan clades with these domains (e.g., Ostreococcus, Micromonas, etc.) | Phylogeny support for novelty/HGT | Origin of novelty? | Putative function in deuterostomes?

protein 1-like
T Teity #12, partial prodomain but no ligand domain in Gapitelta = EDFO ypelv ‘antagonist of Nodal signaling, Developmental patterning? 
2 Univin, Vel, DVR, GDF weak matches to bmp2/4sequences Figure $10.31 type lV agonist of Nodal signaling. Developmental patterning? 
3 TGFb2 e-35 for cnidarian Aiptasia and sponge Sycon, no protosotmes| EDF9 type lV Signaling via Smad2/3 activation. Regulation of cell activities 
a noone, except the sponge Amphimedon has a partial Unique domain combination due to introduction of 3 aan are 
4 Thrombospondin1/2 ERATOR ETE SETH Pan haane type Il Activation of TGFb2 signals 
5 TGEDR2 Sons BES Dade deers tou leer id Orne any ana best Figure $10.3.3 type lV Receptor specific for TGFb2 
deuterostome protein kinase domain 
©-94 to &-85 for the epimerase domain in 
ne Pe rer : ere c ; ; Parcubacterium (bacteria), Prochlorococcus No non-deuterostome metazoan sequenes in gene : nen cou 
4. | UDPN-acetylglucosamine 2-epimerase/N- | symsagittifera (acoel)e-154 matching both domains, but not] (-yanopacterium); Micromonex (green microalga).e- | phylogeny, except acoels of uncertain relatedness to Uot/convergence | ftst step of deuterostome sialic acd synthesis; diferent from 
acetylmannosamine kinase in Hofstenia (acoel); no protostome match , F nage protostome first step 
44 to e-30 for the kinase domain in Mesorhizobium deuterostomes 
and other bacteria 
CMP-N-Acetyineuraminicacid hydroxylase | 2-2 for unrelated match in Caenorhabditis brenneri, neither | __e-127 to €-86, both domains, Ostreococcus and No non-deuterostome metazoan sequences in gene produces a glycolylstalicacid that is also added to 
a shear pan : : ‘ HGT/convergence 
(modifies sialic acid) domain Micromonas (green microalga) phylogeny ($10) glycocongugates in deuterastomes 
z : @-14 match to ST6Gal sialyltransferases of Tribolium, ; : 
By | ih 6 Nac staiyinanstraceCertamced ere and oer nscts witdferent pectic for 2070 «9 for Bethy eresi green mleoclga) and) | wiciea dnstarostome SToGalNAeeonne tad wasilyt ee Nee ener uicinie eared etree 
mily in Amphioxus and ambulacraria) H } Emeliania (coccolithophore) $T6Gal (bilaterian ancestral sequence) 
oligosaccharide terminus. 
as @-61 match to Oscarella (sponge) ST3 sequence, and thene-7|__, 77, ,_ : 
g_ |@lpha-2.3-sialytransferases (expanded family |“ tion to sTeGal of insects with diffrent specify for e-17 to e-10 matches to Bathycossus $T3s (green | unclear; deuterostome ST3 connected to sponge ST3 Gee New suecin city otalste aia compared eorotetamnes 
in Amphioxus and ambulacraria) : ; microalga) and Emeliania (coccolithophore) and weakly to ST6Gal (bilaterian ancestral sequence) 
oligosacharide terminus 
alpha-2/8-sialyltransferases (greatly ; ; 
i : €-7 to ST6Gal of insects, with different specificity for e-12 to e-8 matches to Bathycossus ST8s (green __ unclear; deuterostome ST8 connected weakly to sponge eaten : 
a0 Sxpanded ee aa end oligosacharide terminus microalga) and Emeliania (coccolithophore) ST3 and ST6Gal (bilaterian ancestral sequence) one Es speci Gy) Obs alle lin kage commis red top tostanies 
BAGALNT, adds N-AcetylGal to sialicacid 
11 [containing oligosaccharides (expanded family| none Nitratifractor (epsilon- proteobacterium) No non-deuterostome metazoans in gene phylogeny HGT/convergence synthesis of gangliosides, not found in protostomes 
in Amphioxus and ambulacraria) 
©-77 for NEUI in Nematostella vectensis (shares an fontron | e-68 to e-60 to NEUI-like matches in Blastopirellula, 
o) sialidase 1,2,3,4 (expanded family in boundary); e-40 for NEU1 of the sponge Oscarella. Weak | Sphingobacterium and many other bacteria. e-50-e-40| No non-deuterostome metazoan sequences in gene e1V removal of sialic acid from novel glycosidic linkages of 
Amphioxus and ambulacraria) match (>e-20) of NEU2,3,4 to Nematostella and sponge; no | to Neu5-like sequences in Solibacter and Zobellia, far phylogeny ($10) ‘yp deuterostomes 
protostome matches better than Nematostella or sponge matches. 
Proprotein convertase subtilisin/Kexin type 9-| e-59 match to Platynereis (but no inhibitor domain) and e-40 e103 to e-90 for Kexin-like sequences of in ee Rana ie RO GG IT 
13 related, pcsk9 (expanded family in amphioxus |¢o Capitella (different subtilisin type), and no intron boundary| Actinobacteria such as Saccharothrix and th 8 HGT/convergence WOE 2 eee 
; phylogeny vertebrates) 
and ambulacraria) matches Saccharomonospora 
a SUNDAE WARES 21,7 to an unrelated Ceratitis sequence that lacked PAD | ¢-66 match to Stanieria (cyanobacterium) across all | No non-deuterostome metazoan sequences in gene HGT/convergence | Posttanslational modification of proteins (change arginylto 
domains domains phylogeny citrullinyl residues) 
FATSO (Alpha-ketoglutarate-dependent ©-67 to 6-60 for matches in Micromonas and No non-deuterostome metazoan sequences in gene 
15 pia aeeae TOI Geese een Rie Gee) ieee HGT/convergence N6-methyladenosine demethylation of nuclear RNAs 
@30 to e-25 matches to Lottia and Capitella (protostomes), es rae Sa : = 
16 Arylsulfatase K-like but these match much better to a different sulfatase (2-O- ae eee fe ee ior bac itr) Se ae 10) HGT/convergence removal of sulfate groups, targets unknown 
sulfo-iduronate sulfatase) with bilaterian ancestry ( wflagellate) tr phylogeny, except 
2010 (ie, <€-200) matches to the bacteria 
; ; Metazoan fatty acid synthases e-119 to e-94, but different Flammeovirga (same order of all domains as No non-deuterostome metazoan sequences in gene ae . fea is 
aa poly stde syniiaee ke intron-exon structure and domain subtypes. deuterostome Family 1), and Sorangium, and the phylogeny EN GE ay jake Syria ats be plan en anda mob eeea 
cyanobacterium Stanieria 
15 matches to protostomian peptidylglycine alpha- z ; 5 : Z 
18 NHL-containing protein amidating monooxygenases which contains different NHL | 020 matches in planctomycete bacteria such as REET TIE A ESR TOS VEE DECG HGT/convergence protein-protein interactions? 
_ Zavarzinella; and in Monosiga (choanoflagellate) phylogeny 
repeats and a copper monoxygenase domain 
€-97 match to Selaginella (spikemoss), e-91 to 5 F . . 
19 choline monoxygenase-like >e-1 unrelated matches Physcomitrella (bryophyte), e-90 to Micromonas, | No non-deuterostome metazoan sequences in gene GT convergence arene tren oo enc eos mong peo can baDt) 
: phylogeny ($10) also a methyl donor for methionine regeneration 
Nannochloropsis, and Coccomyxa (micro-algae) 
20 | ectoine synthase-like (expanded family in S7haas uae oe e-21 to Rhizobium (bacterium ) and Cyanothece | No non-deuterostome metazoan sequences in gene HGT/convergence areaction step in the synthesis of ectoine, an osmotic 
amphioxus and hemichordates) (cyanobacterium) phylogeny protectant 
; @18 to e-4 matches to phantoyl dioxygenases of protostomes |e-69 to e-45 matches to ectoine hydroxylases of high GC| : ‘a reaction step in the synthesis of betaine, an osmotic 
oe CASI AIRES and deuterosotmes, containing a distantly related domain | grojup bacteria such as Streptomyces sp. PBH53 REI A SC AMEE oe MOU seh CAGES protectant 
histidine methyltransferase bacterial-like | 6-34 to e-7 matches to Crassostrea (ayster) sequences, which | 4. 49 matches to hacteria such as Fimbrit Crassostrea sequences more closely associated with 
22 (expanded family in amphioxus and —_|share 2-3 intron boundaries with deuterostome sequences. No|® +0 ‘0 &*0 matches fo Backers sch as Kimbriimonas) deuterostome sequences on gene phylogeny, than are type lV Syntheis of trimethyl-histidine derivatives (e.g, ergothionine)? 
: Kuenenia, and Thiothrix. é 
hemichordates other matches outside deuterostomes bacterial sequences (510) 
‘Aromatic amino acid decarboxylase microbial} ¢-46 to €-10 matches to a different kind of aromatic amino 5 : : ; : 7 = F eee = 
Bae eee VEER EET ey eee || CRE Be seis z peaches (marine No non Baismaene Riececoan seiences eens nore Decarboxylation Brera anne ace neurotransmitter 
hemichordates) bilaterian ancestry. Different intron-exon structure. OCC ERE phylogeny ($10) (Ra 
Cobalamine-independent methionine CRUD ARENA ERTS (AE RIDE MEET €-119 to e-99 matches to bacteria such as WEEE ISTIC TEATS. OES SST ENED Addtional pathway of methonine regeneration in 
24 : shares an intron boundary with the deuterostome sequences, ue deuterostome sequences on gene phylogeny, than are type lV ; 
synthase pote Planctomycetales and Rhodovibrio 5 deuterostomes? 
indicating bilaterian ancestry. No protostome match bacterial sequences 
TED Fe aa aaenTG 1 (ie, <€-200) matches to Chlamydimonas, ? @ 
a5, | Maior Facltator Transporter algal-tike, MPS | e+1.6 match to unrelated cubilin-lke protein of Microplitis | gs eer iccns micromonas, and many other micro- | No noR-deuterostome metazoan sequences in gene novos Farictatiane evita neciocs eomnnucas? 
algal-like (insect) with no MES domain ies phylogeny 
Multicopper oxidase (MCO); also called _| _e-S matches to insect laccases which contain three insect | ia : 5 F 
26 | Bilirubin oxidase-like (expanded family in | cupredoxin domains that differ greatly from bilirubin oxidase |" 2tch to Emeliania (coccolithophore) and e-65 to) No non-deuterostome metazoan sequences in gene HGT/convergence oxidation of tetrapyrroles? 
: : Albugo (oomycete) phylogeny 
tunicates, amphioxus, and hemichordates) domains 
27 | FAM198-like protein | none | - | Only deuterostome sequences in gene phylogeny | type I | unknown
28 [chromosome 9 Open Reading Frame 9, Rsboe| 15 ™atches in Nenoturbella, Presagitifera (acoel), Meara 7 Only deuterostome and acoel/nemertodermatid Reena aed Rae ee eee 
(nemertodermatid) sequences in gene phylogeny. 
29 | Melanoregulin-like | none | - | Only deuterostome sequences in gene phylogeny | type I | unknown
30 | Small integral membrane protein 19-like | none | - | Only deuterostome sequences in gene phylogeny | type I | unknown
2 pare aoc eel eal coms icon eat none : Only deuterostome sequenes in gene phylogeny typel unknown 


This table summarizes the description of the novelties in Supplementary Note 10.




doi:10.1038/nature15530 


A perisinusoidal niche for extramedullary 
haematopoiesis in the spleen 


Christopher N. Inra¹*, Bo O. Zhou¹*, Melih Acar¹, Malea M. Murphy¹, James Richardson, Zhiyu Zhao¹ & Sean J. Morrison¹


Haematopoietic stresses mobilize haematopoietic stem cells (HSCs) from the bone marrow to the spleen and induce
extramedullary haematopoiesis (EMH). However, the cellular nature of the EMH niche is unknown. Here we assessed
the sources of the key niche factors, SCF (also known as KITL) and CXCL12, in the mouse spleen after EMH induction by
myeloablation, blood loss, or pregnancy. In each case, Scf was expressed by endothelial cells and Tcf21+ stromal cells,
primarily around sinusoids in the red pulp, while Cxcl12 was expressed by a subset of Tcf21+ stromal cells. EMH induction
markedly expanded the Scf-expressing endothelial cells and stromal cells by inducing proliferation. Most splenic HSCs
were adjacent to Tcf21+ stromal cells in red pulp. Conditional deletion of Scf from spleen endothelial cells, or of Scf or
Cxcl12 from Tcf21+ stromal cells, severely reduced spleen EMH and reduced blood cell counts without affecting bone
marrow haematopoiesis. Endothelial cells and Tcf21+ stromal cells thus create a perisinusoidal EMH niche in the spleen,
which is necessary for the physiological response to diverse haematopoietic stresses.


The haematopoietic system employs facultative niches that arise in
response to injury. Adult haematopoiesis occurs primarily in the bone
marrow of mammals. However, a wide range of haematopoietic stresses
including myelofibrosis, anaemia, pregnancy, infection, myeloablation
and myocardial infarction can induce EMH, in which HSCs
are mobilized to sites outside the bone marrow to expand haematopoiesis.
The splenic red pulp is a prominent site of EMH in mice and
humans. During EMH, HSCs are found mainly around sinusoids in
the red pulp, raising the possibility of a perisinusoidal niche. CXCL12
is expressed by sinusoidal endothelial cells in the red pulp of the human
spleen and macrophage ablation reduces splenic erythropoiesis after
irradiation. However, little else is known about the EMH niche.


Niche factor expression in the spleen 

HSCs are rare in normal adult spleen but myeloablation with
cyclophosphamide followed by daily administration of granulocyte
colony-stimulating factor (G-CSF) induces HSC mobilization from the bone marrow to
the spleen and induction of EMH. Cyclophosphamide plus 21 days of
G-CSF (Cy+21 d G-CSF) increased erythropoiesis and myelopoiesis in
the red pulp, profoundly increasing spleen size, spleen cellularity, HSC
number and progenitor numbers relative to control spleens (Extended
Data Fig. 1c, f-m).

In normal adult spleens from Scf^GFP; Cxcl12^DsRed mice, and after
EMH induction, Scf-green fluorescent protein (GFP) and Cxcl12-DsRed
were primarily expressed throughout the red pulp (Fig. 1a, b and
Extended Data Fig. 1a-e). Red pulp endothelial cells and perivascular
stromal cells expressed high levels of Scf-GFP, irrespective of EMH
induction (Fig. 1a-c and Extended Data Fig. 1d, e). In white pulp, Scf-GFP
was expressed by many fewer stromal cells and central arteriolar
endothelial cells (Fig. 1b and Extended Data Fig. 1e). Cxcl12-DsRed was
not expressed by endothelial cells but was expressed by a subset of
Scf-GFP+ perivascular stromal cells, primarily around red pulp sinusoids
and to a lesser extent around white pulp central arterioles (Fig. 1a-c
and Extended Data Fig. 1d, e).

Scf-GFP+ cells were 0.48 ± 0.10% of enzymatically dissociated
adult spleen cells (Fig. 1d) and Cxcl12-DsRed+ cells were
0.031 ± 0.011% (Fig. 1f). Most Scf-GFP+ cells (75 ± 5.8%) were
VE-cadherin+CD45−Ter119− endothelial cells (Fig. 1d); 85 ± 8.2%
of all VE-cadherin+CD45−Ter119− spleen endothelial cells were
Scf-GFP+ and none expressed Cxcl12-DsRed (Fig. 1e). Non-endothelial
Scf-GFP+ cells were virtually all PDGFR-β+CD45−Ter119− stromal
cells (Fig. 1d). Some Scf-GFP+ stromal cells (22 ± 3.8%) also
expressed Cxcl12-DsRed (Fig. 1d). Virtually all Cxcl12-DsRed+ stromal
cells expressed Scf-GFP (Fig. 1f). Therefore, Scf was expressed by
VE-cadherin+ endothelial cells and PDGFR-β+ stromal cells while
Cxcl12 was expressed by a minority of Scf-expressing stromal cells in
adult spleen.
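
As a rough consistency check on these frequencies, the reported means can be chained together. The sketch below is illustrative arithmetic only, not an analysis from the paper: it ignores the reported standard deviations except for one comparison, and it assumes that all non-endothelial Scf-GFP+ cells are stromal cells (the text says "virtually all"). The variable names are hypothetical.

```python
# Reported means, in percent.
scf_pct_of_spleen = 0.48           # Scf-GFP+ cells among all spleen cells
endothelial_pct_of_scf = 75.0      # Scf-GFP+ cells that are endothelial
cxcl12_pct_of_scf_stromal = 22.0   # Scf-GFP+ stromal cells that are Cxcl12-DsRed+
cxcl12_pct_of_spleen = 0.031       # Cxcl12-DsRed+ cells among all spleen cells
cxcl12_sd = 0.011                  # reported s.d. of that frequency


def pct_of(parent_pct, child_pct_of_parent):
    """Frequency of a sub-population expressed relative to the grandparent gate."""
    return parent_pct * child_pct_of_parent / 100.0


scf_endothelial = pct_of(scf_pct_of_spleen, endothelial_pct_of_scf)
# Assumption: every non-endothelial Scf-GFP+ cell is a stromal cell.
scf_stromal = scf_pct_of_spleen - scf_endothelial
double_positive = pct_of(scf_stromal, cxcl12_pct_of_scf_stromal)

print(f"Scf-GFP+ endothelial cells: {scf_endothelial:.3f}% of spleen cells")
print(f"Scf-GFP+ stromal cells:     {scf_stromal:.3f}% of spleen cells")
print(f"Scf-GFP+ Cxcl12-DsRed+:     {double_positive:.4f}% of spleen cells")
print("Within one s.d. of the measured Cxcl12-DsRed+ frequency:",
      abs(double_positive - cxcl12_pct_of_spleen) <= cxcl12_sd)
```

Under these assumptions the derived double-positive frequency (about 0.026% of spleen cells) lies within one standard deviation of the directly measured Cxcl12-DsRed+ frequency, which fits the statement that virtually all Cxcl12-DsRed+ stromal cells also expressed Scf-GFP.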

EMH induction did not appear to alter spleen Scf-GFP or Cxcl12-DsRed
expression (Fig. 1a versus Extended Data Fig. 1d). Flow cytometric
analysis showed no change in the fluorescence intensity of
individual Scf-GFP+ or Cxcl12-DsRed+ spleen cells after EMH induction
(Extended Data Fig. 1o, p). However, the frequencies and absolute
numbers of Scf-GFP+ and Cxcl12-DsRed+ cells increased significantly
upon EMH induction (Fig. 1g-j and Extended Data Fig. 1q, r). These
cells rarely divided in normal adult spleen but proliferated upon EMH
induction (Fig. 1j, k).

LepR+ stromal cells are the main sources of SCF and CXCL12 for
HSC maintenance in the bone marrow. In the spleens of Lepr^cre;
R26^tdTomato mice, recombination occurred mainly in the white pulp,
where HSCs are not observed (Extended Data Fig. 1s). Only about
20% of Scf-GFP+ stromal cells expressed LepR (Extended Data Fig. 1t).
LepR+ cells were PDGFR-β+VE-cadherin− stromal cells that accounted
for 37 ± 13% of colony-forming-unit fibroblasts (CFU-Fs) formed by
enzymatically dissociated spleen cells (Extended Data Fig. 1u, v).

Consistent with our prior study, Lepr^cre; Scf^fl/− mice had significantly
fewer CD150+CD48−LSK HSCs in the bone marrow and
significantly increased spleen cellularity relative to Scf^+/− and Scf^+/+
controls (Extended Data Fig. 1w, x). Upon EMH induction by
cyclophosphamide plus 4 days of G-CSF (Cy+4 d G-CSF), Lepr^cre; Scf^fl/−
mice exhibited significant declines in spleen cellularity and spleen HSC
number relative to controls (Extended Data Fig. 1x, y). While LepR+
perivascular stromal cells could contribute to the EMH niche in adult
spleen, the impaired EMH in these mice may also reflect bone marrow
HSC depletion before EMH induction (Extended Data Fig. 1w).


¹Department of Pediatrics and Children’s Research Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA. ²Department of Pathology, University of Texas
Southwestern Medical Center, Dallas, Texas 75390, USA. ³Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA.
*These authors contributed equally to this work.


466 | NATURE | VOL 527 | 26 NOVEMBER 2015 


[Figure 1 panels. Images stained for GFP/DsRed/laminin (scale bar, 400 µm). d-f, Flow cytometry plots (axes include Scf-GFP, PDGFR-β and Cxcl12-DsRed) with gated frequencies: d, 0.48 ± 0.10%, 75 ± 5.8%, 92 ± 5.5%, 22 ± 3.8%; e, 0.40 ± 0.13%, 85 ± 8.2%; f, 0.031 ± 0.011%, 90 ± 6.6%. g-l, Bar graphs for Scf+ and Cxcl12+ cells with and without EMH induction (+EMH), including populations gated as whole spleen cells, CD45−Ter119−, VE-Cad− and VE-Cad+.]



Figure 1 | Endothelial cells and perivascular stromal cells in the red 
pulp express Scf and Cxcl12 and proliferate upon induction of EMH. 
a, b, Scf-GFP and Cxcl12-DsRed were mainly expressed by stromal cells in 
the red pulp of normal spleens. b, High-magnification view of the boxed 
area in a. Dashed lines depict the boundary between white pulp (WP) and 
red pulp (RP). Arrow indicates central arteriole in the white pulp, around 
which rare stromal cells expressed Cxcl12-DsRed. c, Splenic red pulp from 
Scf^GFP; Cxcl12^DsRed mice had VE-cadherin+ endothelial cells (arrows) that 
expressed Scf-GFP and VE-cadherin− stromal cells (arrowheads) that 
expressed Scf-GFP and sometimes Cxcl12-DsRed. VE-Cad, VE-cadherin. 
d-f, Flow cytometric analysis of enzymatically dissociated spleen cells 
from Scf^GFP; Cxcl12^DsRed mice. Scf-GFP was expressed by VE-cadherin+ 
endothelial cells and PDGFR-β+VE-cadherin− stromal cells, a subset of 
which also expressed Cxcl12-DsRed (d). Most VE-cadherin+ endothelial 
cells were positive for Scf-GFP but negative for Cxcl12-DsRed (e). Most 
Cxcl12-DsRed+ cells were positive for Scf-GFP (f). Data in a-f represent 
mean ± s.d. from 3 mice from 3 independent experiments. g-l, The 
frequencies and absolute numbers of Scf-GFP+ cells (g, i) and 
Cxcl12-DsRed+ cells (h, j) significantly increased upon induction of EMH by 
Cy+21 d G-CSF (+EMH). k, l, 5-Bromo-2′-deoxyuridine (BrdU) was 
co-administered to Scf^GFP (k) or Cxcl12^DsRed mice (l) along with G-CSF for 7 
days after cyclophosphamide treatment. Data represent mean ± s.d. from 3 
independent experiments. The numbers of mice per treatment are shown 
on the bars in panels. g-l, Two-tailed Student’s t-tests were used to assess 
statistical significance (**P < 0.01, ***P < 0.001). 
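
The statistics named in this legend (mean ± s.d. per group, then a two-tailed Student's t-test reported with asterisk thresholds) can be sketched as below. This is an illustrative sketch only, not the authors' analysis code: the function names are hypothetical, the cell frequencies are made-up placeholder values, and the equal-variance (pooled) form of the test is an assumption, because the legend does not state which variant was used.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed P value from Simpson integration of the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


def student_t_test(a, b):
    """Two-sample Student's t-test, assuming equal variances (pooled s.d.)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    if pooled_var == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df, two_tailed_p(t, df)


def stars(p):
    """Map a P value to the asterisk notation used in the legends."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "n.s."


if __name__ == "__main__":
    # Hypothetical placeholder frequencies (% of spleen cells), not data from the paper.
    minus_emh = [0.45, 0.52, 0.47]
    plus_emh = [0.95, 1.10, 1.02]
    t, df, p = student_t_test(plus_emh, minus_emh)
    print(f"-EMH: {mean(minus_emh):.2f} ± {stdev(minus_emh):.2f}")
    print(f"+EMH: {mean(plus_emh):.2f} ± {stdev(plus_emh):.2f}")
    print(f"t = {t:.2f}, df = {df}, P = {p:.4g} {stars(p)}")
```

The P value is computed with the standard library only, so the sketch runs without a statistics package; a dedicated library routine would normally be preferred for real data.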


ARTICLE 


Tcf21+ perisinusoidal stromal cells express Scf 
To identify cre alleles that recombine in spleen, but not bone marrow, 
stromal cells, we assessed the gene expression profile of spleen 
Scf-GFP+VE-cadherin− stromal cells (Extended Data Table 1). After testing 
a number of cre alleles (see Extended Data Fig. 2), we found that 
Tcf21-Cre/ER (ref. 21) recombined efficiently in spleen Scf-GFP+ stromal cells 
(Fig. 2a) but not in bone marrow (Fig. 2b, c). Tcf21^cre/ER; R26^tdTomato 
mice gavaged with tamoxifen for 12 days at 4-6 weeks of age expressed 
Tomato in Scf-GFP+ stromal cells throughout the red pulp (Fig. 2a, d), 
whereas Tomato was expressed only in rare white pulp cells (Fig. 2a) 
and in no endothelial cells (Fig. 2d, e). Tomato+CD45−Ter119− stromal 
cells from enzymatically dissociated Tcf21^cre/ER; R26^tdTomato spleens 
accounted for 0.085 ± 0.045% of spleen cells and 69 ± 2% of spleen 
CFU-Fs (Fig. 2f, g). These cells were PDGFR-β+ and LepR− (Fig. 2f). 

In the liver, Scf-GFP was exclusively expressed by VE-cadherin+ 
endothelial cells (Extended Data Fig. 2a, b). Tcf21-Cre/ER recombined 
in 0.09% of liver cells, none of which expressed Scf-GFP (Extended 
Data Fig. 2a, c). The Tcf21-Cre/ER recombination pattern did not 
significantly change in the spleen (Fig. 2f and Extended Data Fig. 2d, e), 
bone marrow (Extended Data Fig. 2f, g), or liver (Extended Data 
Fig. 2h, i) upon EMH induction by Cy+21 d G-CSF. 

c-Kit+ haematopoietic progenitors were almost exclusively within 
the red pulp in the normal spleen (Extended Data Fig. 3a, b) and after 
EMH induction (Fig. 2k). To assess HSC localization we used a new 
technique that permits deep imaging of α-catulin-GFP+c-Kit+ HSCs in 
optically cleared haematopoietic tissues (ref. 22). In the spleens of mice 
treated with Cy+4 d G-CSF, only 0.019 ± 0.01% of splenocytes were 
α-catulin-GFP+c-Kit+ (Fig. 2h). All long-term multilineage reconstituting cells 
in the spleen were α-catulin-GFP+ and 28% of α-catulin-GFP+c-Kit+ 
spleen cells gave long-term multilineage reconstitution in primary 
(Fig. 2i) and secondary irradiated recipient mice (data not shown). 

After antibody staining of a large segment of Tcf21^cre/ER; R26^tdTomato; 
α-catulin^GFP spleen, we cleared the tissue (Extended Data Fig. 3c, d), 
then imaged to a depth of 300 μm and digitally reconstructed the tissue 
(Extended Data Fig. 3e, f and Supplementary Video 1). 
α-Catulin-GFP+c-Kit+ HSCs were found exclusively within the red pulp, where 
80% were within 5 μm of Tomato+ stromal cells (Fig. 2j). 
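
The proximity measurement described above (the fraction of HSCs lying within 5 μm of a Tomato+ stromal cell, compared with random spots) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' imaging pipeline: cells are reduced to hypothetical 3D point coordinates in micrometres, distance is measured point to point rather than to segmented cell surfaces, "within" is taken to include the threshold itself, and all function names and coordinates are invented for the example.

```python
import math
import random


def nearest_distance(point, targets):
    """Euclidean distance from one point to the closest target point."""
    return min(math.dist(point, target) for target in targets)


def fraction_within(points, targets, threshold_um):
    """Fraction of points whose nearest target lies within threshold_um."""
    hits = sum(1 for p in points if nearest_distance(p, targets) <= threshold_um)
    return hits / len(points)


def random_spots(n, bounds, seed=0):
    """Draw n uniformly distributed control spots inside a box.

    bounds is a list of (low, high) pairs, one per axis.
    """
    rng = random.Random(seed)
    return [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical coordinates (μm); not measurements from the paper.
    stromal_cells = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (40.0, 25.0, 10.0)]
    hscs = [(1.0, 0.0, 0.0), (2.0, 2.0, 0.0), (5.0, 0.0, 20.0)]
    # Imaging volume assumed to be a 50 x 50 x 300 μm box for the control spots.
    spots = random_spots(1000, [(0.0, 50.0), (0.0, 50.0), (0.0, 300.0)])

    print("HSCs within 5 μm:", fraction_within(hscs, stromal_cells, 5.0))
    print("Random spots within 5 μm:", fraction_within(spots, stromal_cells, 5.0))
```

Comparing the HSC fraction with the random-spot fraction is what distinguishes genuine association with stromal cells from the proximity expected by chance in a densely populated tissue.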


EMH requires SCF and CXCL12 from Tcf21+ cells 

To test whether Tcf21^cre/ER-expressing perivascular cells promote 
EMH, we treated 4-6-week-old Tcf21^cre/ER; Scf^fl/fl and littermate 
control mice with tamoxifen for 12 days. A month later, bone marrow 
and spleen cellularity, blood cell counts, and bone marrow 
haematopoiesis were similar in Tcf21^cre/ER; Scf^fl/fl mice and littermate controls 
(Fig. 3a-f and Extended Data Fig. 3g-l). Then we treated Tcf21^cre/ER; 
Scf^fl/fl mice and littermate controls with cyclophosphamide followed 
by 4, 8, or 21 days of G-CSF. Tcf21^cre/ER; Scf^fl/fl mice did not differ 
from controls with respect to bone marrow cellularity (Fig. 3a) or the 
numbers of HSCs (Fig. 3b), common myeloid progenitors (CMPs; ref. 23), 
granulocyte-macrophage progenitors (GMPs; ref. 23), or 
megakaryocyte-erythroid progenitors (MEPs; ref. 23) in the bone marrow after Cy+4-21 
d G-CSF treatment (Extended Data Fig. 3j-l). In contrast, Tcf21^cre/ER; 
Scf^fl/fl mice had significantly fewer splenocytes (Fig. 3c), spleen HSCs 
(Fig. 3d), CMPs (Fig. 3e), GMPs (Extended Data Fig. 3m) and MEPs 
(Fig. 3f) relative to littermate controls after Cy+8-21 d G-CSF 
treatment. We did not detect any difference between Tcf21^cre/ER; Scf^fl/fl mice 
and littermate controls in terms of vascular or stromal cell 
morphology in the spleen, with or without induction of EMH (Extended Data 
Fig. 4a-g). Conditional deletion of Scf with Tcf21-Cre/ER thus depletes 
HSCs and reduces EMH in the spleen without affecting bone marrow 
haematopoiesis. 

Red blood cell (RBC) and white blood cell (WBC) counts were 
significantly lower in Tcf21^cre/ER; Scf^fl/fl mice as compared to controls after 
Cy+8-21 d G-CSF treatment (Extended Data Fig. 3g-i). Splenectomy 
significantly reduced RBC and WBC counts in mice treated with 
Cy+G-CSF, demonstrating that splenic EMH is necessary for the 




[Figure 2 image. Panels a-c: Tcf21^cre/ER; R26^tdTomato spleen (a) and femur (b, c).
Panels d, e: Tomato staining. Panel f: flow cytometry of Tomato, PDGFR-β and
LepR, −EMH and +EMH. Panel g: CFU-F colonies. Panel h: GFP in wild type and
α-catulin^GFP spleen cells (gate, 0.019 ± 0.01%). Panel i: reconstitution from
GFP+ cells, GFP+c-Kit+ cells and splenocytes. Panel j: distance distributions
(% spots) for HSCs and random spots. Panel k: c-Kit/GFP/Tomato staining.]


Figure 2 | During EMH most HSCs localize adjacent to Tcf21+ stromal 
cells in the red pulp. a, Tamoxifen-treated adult Tcf21^cre/ER; R26^tdTomato 
mice exhibited widespread Tomato expression by perivascular stromal 
cells in the red pulp (RP). b, c, No Tomato expression in bone marrow 
from tamoxifen-treated Tcf21^cre/ER; R26^tdTomato mice. d, e, Most 
Scf-GFP+VE-cadherin− stromal cells were Tomato+ (arrows) whereas 
Scf-GFP+VE-cadherin+ endothelial cells were Tomato− (arrowhead). 
f, Tomato+CD45−Ter119− stromal cells from enzymatically dissociated 
spleen from Tcf21^cre/ER; R26^tdTomato mice were positive for PDGFR-β 
but negative for LepR, irrespective of EMH induction by Cy+G-CSF. 
g, Percentage of all CFU-F colonies formed by enzymatically dissociated 
Tcf21^cre/ER; R26^tdTomato spleen cells that were Tomato+. Macrophage 
colonies were excluded by staining with anti-CD45 antibody. 
h, α-Catulin-GFP+c-Kit+ HSCs represented 0.019 ± 0.01% of dissociated spleen cells in 
α-catulin^GFP mice with EMH. i, α-Catulin-GFP+c-Kit+ splenocytes were 
highly enriched for long-term multilineage reconstituting (LTMR) HSCs. 
j, k, Deep imaging of α-catulin-GFP+c-Kit+ HSCs (arrows in k) in optically 
cleared spleen from a Tcf21^cre/ER; R26^tdTomato; α-catulin^GFP mouse with EMH 
induced by Cy+21 d G-CSF. The distance from α-catulin-GFP+c-Kit+ 
HSCs or random spots to Tomato+ stromal cells (j; *P < 0.05 by two-tailed 
Student's t-test). α-Catulin-GFP+c-Kit+ HSCs were exclusively in the red 
pulp (k; see Extended Data Fig. 3f for a low-magnification view). All data 
reflect mean ± s.d. from 3 mice in 3 independent experiments. 


recovery of blood cell counts (Fig. 3g, h and Extended Data Fig. 3n). 
However, conditional deletion of Scf by Tcf21-Cre/ER did not further 
reduce blood cell counts in splenectomized mice (Fig. 3g, h). SCF 
expression by Tcf21+ stromal cells in the spleen is thus necessary for 
the regeneration of blood cells after Cy+G-CSF treatment. 

Bone marrow cellularity and bone marrow haematopoiesis were 
similar in Tcf21^cre/ER; Cxcl12^fl/− mice and littermate controls, before 
and after Cy+G-CSF treatment (Fig. 3i, j and Extended Data Fig. 3r-t). 
However, Tcf21^cre/ER; Cxcl12^fl/− mice exhibited significantly reduced 
spleen cellularity (Fig. 3k) and numbers of spleen CMPs, GMPs and 



[Figure 3 image. Bar graphs comparing Scf^fl/fl and Tcf21^cre/ER; Scf^fl/fl mice,
not treated (NT) or after 4, 8 or 21 days of G-CSF: a, BM cellularity;
b, BM HSCs; c, SP cellularity; d, SP HSCs; e, SP CMPs; f, SP MEPs.
Panels g, h: sham-operated and splenectomized Scf^fl/fl and
Tcf21^cre/ER; Scf^fl/fl mice. Bar graphs comparing control and
Tcf21^cre/ER; Cxcl12^fl/− mice: i, BM cellularity; j, BM HSCs;
k, SP cellularity; l-n, spleen HSCs, SP CMPs and SP MEPs; o, blood HSCs.]


Figure 3 | Tcf21-expressing stromal cells are an important source of 
SCF and CXCL12 for EMH in the spleen. a-f, Tcf21^cre/ER; Scf^fl/fl and 
Scf^fl/fl control mice were treated with tamoxifen then examined 1 month 
later either under normal conditions (not treated (NT)) or after treatment 
with Cy+4-21 d G-CSF to induce EMH. The number of bone marrow 
(BM) cells (a) and bone marrow CD150+CD48−LSK HSCs (b) in one 
femur plus one tibia as well as spleen (SP) cellularity (c) and the numbers 
of HSCs (d), CMPs (e) and MEPs (f) in the spleen are shown. g, h, Sham-operated 
and splenectomized mice were treated with Cy+21 d G-CSF 1 month 
after surgery: WBC (g) and RBC (h) counts are shown. i-o, Tcf21^cre/ER; 
Cxcl12^fl/− and Cxcl12^+/− or Cxcl12^fl/− control mice were treated with 
tamoxifen then examined 1 month later either under normal conditions 
(NT) or after treatment with Cy+4-21 d G-CSF to induce EMH. The 
number of bone marrow cells (i) and bone marrow HSCs (j) in one femur 
plus one tibia as well as spleen cellularity (k), numbers of HSCs (l), CMPs 
(m) and MEPs (n) in the spleen are shown. o, Number of HSCs per ml of 
blood in tamoxifen-treated control and Tcf21^cre/ER; Cxcl12^fl/− mice after 
Cy+21 d G-CSF. The numbers of mice per treatment are shown in each 
bar in each panel. All panels reflect mean ± s.d. from three independent 
experiments. *P < 0.05, **P < 0.01, ***P < 0.001, statistical significance 
relative to sham-operated Scf^fl/fl mice. †P < 0.05, ††P < 0.01, statistical 
significance among other treatments. 


MEPs (Fig. 3m, n and Extended Data Fig. 3u) relative to controls after 
Cy+8-21 d G-CSF treatment. Although the number of HSCs in the 
spleens of Tcf21^cre/ER; Cxcl12^fl/− mice did not significantly differ from 
littermate controls (Fig. 3l), HSC numbers were significantly elevated 
in the blood (Fig. 3o) and in the bone marrow (Fig. 3j) of Tcf21^cre/ER; 
Cxcl12^fl/− mice after Cy+21 d G-CSF treatment. This suggests that 
some HSCs were mobilized from the spleens of Tcf21^cre/ER; Cxcl12^fl/− 
mice. Tcf21^cre/ER; Cxcl12^fl/− mice also had significantly reduced RBC 
counts after Cy+21 d G-CSF treatment (Extended Data Fig. 3o-q). 
We did not detect any difference between Tcf21^cre/ER; Cxcl12^fl/− 
mice and littermate controls in terms of the frequency or 
morphology of vascular or stromal cells in the spleen, with or without EMH 




(Extended Data Fig. 4h-n). Tcf21-Cre/ER-expressing stromal cells are 
thus an important source of CXCL12 for spleen EMH but not bone 
marrow haematopoiesis. 


EMH requires SCF from endothelial cells 

We discovered that Vav1-Cre recombines efficiently in spleen, but 
not bone marrow, endothelial cells. Vav1-cre; R26^tdTomato mice 
recombined throughout the red pulp in VE-cadherin+Scf-GFP+ cells but 
only in rare white pulp cells (Fig. 4a-c). VE-cadherin+Scf-GFP+ cells 
accounted for 0.37 ± 0.07% of enzymatically dissociated spleen cells 
and 83 ± 5.3% of these cells recombined with Vav1-Cre (Fig. 4b). These 
cells were negative for PDGFR-β (Extended Data Fig. 5a). Seventy ± 5% 
of VE-cadherin+ endothelial cells were Tomato+ in the spleens of 
Vav1-cre; R26^tdTomato mice but only 8.4 ± 0.5% were Tomato+ in 
bone marrow (Extended Data Fig. 5b, e-h). Endothelial cells from 
Vav1-cre; Scf^fl/− mice exhibited a 6.5-fold reduction in Scf transcript 
levels (Extended Data Fig. 5c) and a 5.6-fold reduction in SCF protein 
(Extended Data Fig. 5d) relative to endothelial cells from Scf^fl/− 
controls. 

In the livers of Vav1-cre; R26^tdTomato; Scf^GFP mice recombination 
occurred in 26 ± 4.2% of VE-cadherin+Scf-GFP+ cells (Extended Data 
Fig. 5i-k). Upon induction of EMH by Cy+G-CSF, Vav1-Cre 
recombination did not significantly change in the spleen (Extended Data 
Figs 5b and 6a, b), bone marrow (Extended Data Figs 5b and 6c, d) or 
liver (Extended Data Fig. 6e, f). 

Cxcl12 was not expressed by spleen endothelial cells (Fig. 1e). 
Consistent with this, Vav1-cre; Cxcl12^fl/− mice had normal blood 
counts, cellularity, and numbers of HSCs, CMPs, GMPs and MEPs in 
bone marrow and spleen after Cy+G-CSF (Extended Data Fig. 6g-s). 

Vav1-Cre also recombines in haematopoietic cells (ref. 24) but 
haematopoietic cells do not express Scf and Vav1-cre; Scf^fl/− mice have 
normal HSC frequency and haematopoiesis in bone marrow (refs 18, 19). Prior to 
EMH induction with Cy+G-CSF, Vav1-cre; Scf^fl/− mice did not 
significantly differ from Scf^fl/− controls with respect to bone marrow or 
spleen cellularity, or the numbers of HSCs, CMPs, GMPs or MEPs in 
the bone marrow or spleen (Fig. 4d-i and Extended Data Fig. 6w-z). 
After Cy+G-CSF treatment, bone marrow cellularity and numbers of 
bone marrow HSCs, CMPs, GMPs or MEPs in Vav1-cre; Scf^fl/− mice 
were normal (Fig. 4d, e and Extended Data Fig. 6w-y). However, RBC 
counts, spleen cellularity, and the numbers of spleen HSCs, CMPs 
and MEPs declined in Vav1-cre; Scf^fl/− mice relative to Scf^fl/− controls 
(Fig. 4f-i and Extended Data Fig. 6t-v). 

The decline in blood cell counts in Vav1-cre; Scf^fl/− mice after EMH 
induction was caused by reduced spleen EMH because splenectomy 
significantly reduced RBC and WBC counts but conditional deletion 
of Scf in splenectomized Vav1-cre; Scf^fl/− mice had no further effect on 
blood cell counts (Fig. 4j, k). We did not detect any difference between 
Vav1-cre; Scf^fl/− mice and controls in terms of the frequency or 
morphology of vascular or stromal cells in the spleen (Extended Data 
Fig. 4o-u). Endothelial SCF expression is thus necessary for splenic 
EMH and the recovery of blood cell counts after Cy+G-CSF. 


The splenic EMH niche during pregnancy 

Erythropoiesis and myelopoiesis significantly increased in the red pulp 
during pregnancy, profoundly increasing spleen cellularity, HSC 
number, and progenitor numbers relative to non-pregnant mice (Extended 
Data Fig. 7a-i). Just as in Cy+G-CSF-treated mice, Scf-GFP was largely 
expressed by endothelial and perivascular stromal cells in the red pulp 
and Cxcl12-DsRed was expressed by a subset of the Scf-GFP+ 
stromal cells (Extended Data Fig. 7j-l). Pregnancy induced these cells to 
proliferate, significantly expanding their numbers (Extended Data 
Fig. 7m-o). In pregnant mice, Tcf21-Cre/ER recombined in spleen 
PDGFR-β+LepR− stromal cells but not in bone marrow and rarely in 
liver (Extended Data Fig. 7p-v). Vav1-cre; Scf mutant mice were infertile, 
preventing us from testing the endothelial contribution to EMH during 
pregnancy. 




[Figure 4 image. Panel a: Vav1-cre; R26^tdTomato spleen, Tomato/laminin staining.
Panels b, c: Vav1-cre; R26^tdTomato; Scf^GFP spleen (gates, 0.37 ± 0.07% and
83 ± 5.3%). Bar graphs for mice not treated (NT) or after 4, 8 or 21 days of
G-CSF: d, BM cellularity; e, BM HSCs; f, SP cellularity; g, SP HSCs;
h, SP CMPs; i, SP MEPs. Panels j, k: sham-operated and splenectomized
Scf^fl/− and Vav1-cre; Scf^fl/− mice, NT or day 21.]
Figure 4 | Endothelial cells are an important source of SCF for EMH 
in the spleen. a, Vav1-cre; R26^tdTomato mice exhibited vascular Tomato 
expression throughout the splenic red pulp (RP). Tomato was also 
expressed by haematopoietic cells in these mice but levels of Tomato 
expression in endothelial cells were ~10-100 fold higher than in 
haematopoietic cells. Therefore short-exposure images showed mainly 
Tomato fluorescence in endothelial cells. WP, white pulp. b, c, Vav1-Cre 
recombined in VE-cadherin+Scf-GFP+ endothelial cells (arrowheads in c) 
but not in VE-cadherin−Scf-GFP+ perivascular stromal cells (arrows in c). 
d-i, Vav1-cre; Scf^fl/− mice and Scf^+/− or Scf^fl/− controls were not treated 
(NT) or treated with Cy+4-21 d G-CSF to induce EMH. Data show the 
number of bone marrow (BM) cells (d) and bone marrow HSCs (e) in one 
femur plus one tibia as well as spleen (SP) cellularity (f) and the numbers of 
HSCs (g), CMPs (h) and MEPs (i) in the spleen. j, k, WBC (j) and RBC (k) 
counts in splenectomized and sham-operated mice before and after 
Cy+21 d G-CSF treatment. The numbers of mice per treatment are shown 
in the bars in each panel. All data reflect mean ± s.d. from 3 (a-c; j, k) 
or 6 (d-i) independent experiments. d-i, *P < 0.05, **P < 0.01, 
***P < 0.001, statistical significance relative to Scf^fl/−. †P < 0.05, 
††P < 0.01, †††P < 0.001, statistical significance between Scf^fl/− and 
Vav1-cre; Scf^fl/− mice. j, k, *P < 0.05, **P < 0.01, ***P < 0.001, statistical 
significance relative to sham-operated Scf^fl/−. †P < 0.05, ††P < 0.01, 
statistical significance between other treatments. 


Pregnant Tcf21^cre/ER; Scf^fl/fl females did not differ from Scf^fl/fl control 
females in terms of bone marrow cellularity (Fig. 5a), or the numbers of 
HSCs (Fig. 5b), GMPs, CMPs or MEPs in the bone marrow (Extended 
Data Fig. 8a-d). In contrast, pregnant Tcf21^cre/ER; Scf^fl/fl females 
exhibited significantly lower spleen cellularity and numbers of HSCs, GMPs, 
CMPs, MEPs, myeloid and erythroid cells in the spleen as compared 
to pregnant Scf^fl/fl females (Fig. 5c-f and Extended Data Fig. 8e, f). 
Pregnant Tcf21^cre/ER; Scf^fl/fl females had significantly lower RBC counts 
than pregnant Scf^fl/fl controls (Fig. 5g), and significantly lower fetal 
mass (Fig. 5h). SCF from Tcf21+ perivascular cells is thus necessary for 
splenic EMH and for the expansion of erythropoiesis during pregnancy. 




[Figure 5 image. Pregnancy (normal Scf^fl/fl versus pregnant Scf^fl/fl and
Tcf21^cre/ER; Scf^fl/fl females): a, BM cellularity; b, BM HSCs;
c, SP cellularity; d, SP HSCs; e, SP CMPs; f, SP MEPs; g, h, bar graphs.
Blood loss (normal Scf^fl/fl versus bled Scf^fl/fl, Vav1-cre; Scf^fl/fl,
Tcf21^cre/ER; Scf^fl/fl and compound mutant mice): i, BM cellularity;
j, BM HSCs; k, SP cellularity; l, SP HSCs; m, n, spleen progenitors; o, bar graph.]


Figure 5 | SCF from endothelial cells and Tcf21+ stromal cells is 
necessary for splenic EMH and adequate erythropoiesis after bleeding 
or during pregnancy. a-h, Four-to-six-month-old female mice that had 
been treated with tamoxifen at least 2 months earlier were mated with 
normal wild-type males. a-f, Normal females and pregnant females at 
gestation day 18.5 were analysed: the number of bone marrow (BM) cells 
(a) and bone marrow HSCs (b) in one femur plus one tibia as well as 
spleen (SP) cellularity (c), and the numbers of HSCs (d), CMPs (e) and 
MEPs (f) in the spleen are shown. g, h, RBC counts (g) and fetal mass (h). 
i-n, Four-to-six-month-old mice with the indicated genetic backgrounds 
were repeatedly bled over a 2-week period then analysed: the number of 
bone marrow cells (i) and bone marrow HSCs (j) in one femur plus one 
tibia as well as spleen cellularity (k), and the numbers of HSCs (l), CMPs 
(m) and MEPs (n) in the spleen are shown. o, RBC counts. The numbers 
of mice per treatment are shown in each bar of each panel. All data reflect 
mean ± s.d. from 4 (a-h) or 3 (i-o) independent experiments. *P < 0.05, 
**P < 0.01, ***P < 0.001, statistical significance relative to normal mice. 
†P < 0.05, ††P < 0.01, statistical significance between single mutants 
and compound mutants. ‡P < 0.05, ‡‡P < 0.01, ‡‡‡P < 0.001, statistical 
significance between Scf mutant mice and control mice after bleeding or 
pregnancy. 


Pregnant Tcf21^cre/ER; Cxcl12^fl/− females also had significantly reduced 
splenic cellularity and splenic erythropoiesis relative to pregnant 
Cxcl12^fl/− controls, without any changes in bone marrow 
haematopoiesis (Extended Data Fig. 8i-x). 


The splenic EMH niche after blood loss 

Repeated bleeding significantly increased erythropoiesis and 
myelopoiesis in the red pulp, increasing spleen cellularity, HSC number, 
and progenitor numbers relative to non-bled controls (Extended Data 
Fig. 9a-i). Just as in Cy+G-CSF-treated mice, Scf-GFP was largely 
expressed by endothelial cells and perivascular stromal cells in the red 
pulp while Cxcl12-DsRed was expressed by a subset of Scf-GFP+ 
stromal cells (Extended Data Fig. 9j-l). Blood loss induced the 
proliferation of these cells, significantly expanding their numbers (Extended 
Data Fig. 9m-o). In bled mice, Tcf21-Cre/ER recombined in red pulp 
PDGFR-β+LepR− stromal cells, but not in bone marrow and rarely in 
liver (Extended Data Fig. 9p-v). Vav1-Cre recombined in 66 ± 4.2% of 




spleen endothelial cells, mainly in the red pulp, but only in 7.5 ± 4.0% of 
bone marrow endothelial cells and 25 ± 5.8% of liver endothelial cells 
(Extended Data Fig. 10a-h). 

Bled Tcf21^cre/ER; Scf^fl/fl mice or Vav1-cre; Scf^fl/fl mice did not differ 
from bled Scf^fl/fl controls in bone marrow cellularity (Fig. 5i), or the 
numbers of HSCs (Fig. 5j), GMPs, CMPs or MEPs in the bone marrow 
(Extended Data Fig. 10i-l). In contrast, bled Tcf21^cre/ER; Scf^fl/fl mice 
and Vav1-cre; Scf^fl/fl mice each had significantly lower RBC counts, 
spleen cellularity, and numbers of HSCs, GMPs, CMPs, MEPs, myeloid 
and erythroid cells in the spleen as compared to bled Scf^fl/fl controls 
(Fig. 5l-o and Extended Data Fig. 10m-p). Tcf21+ stromal cells and 
endothelial cells are thus necessary for EMH in the spleen and for the 
expansion of erythropoiesis after bleeding. 

Endothelial and Tcf21+ stromal cells had additive effects on splenic 
EMH and the recovery of RBC counts after bleeding. Bled Vav1-cre; 
Tcf21^cre/ER; Scf^fl/fl mice had similar bone marrow cellularity and 
numbers of HSCs in the bone marrow as bled Scf^fl/fl controls (Fig. 5i, j). 
However, they had significantly reduced RBC counts, spleen 
cellularity, and numbers of HSCs, MEPs and erythroid cells in the spleen as 
compared to bled Scf^fl/fl mice, bled Vav1-cre; Scf^fl/fl mice, and bled 
Tcf21^cre/ER; Scf^fl/fl mice (Fig. 5k-n and Extended Data Fig. 10p). 

Bled Tcf21^cre/ER; Cxcl12^fl/− mice also had significantly reduced 
cellularity, MEPs and erythroid cells in the spleen as well as 
significantly reduced RBC counts as compared to bled Cxcl12^fl/− controls, 
without any differences in bone marrow haematopoiesis (Extended 
Data Fig. 10q-e′). 

The EMH niche in mouse spleen is created by endothelial cells and 
Tcf21-expressing stromal cells associated with red pulp sinusoids and 
is functionally important for haematopoietic recovery from a range of 
stresses. A prior study (ref. 15) detected CXCL12 expression in endothelial 
cells in human spleens. This suggests that endothelial cells are also 
a component of the EMH niche in humans, but there may be 
species differences in CXCL12 expression among niche cells. It is not 
clear whether there is any relationship between the Cxcl12-abundant 
reticular (CAR) cells that are part of the bone marrow niche (ref. 25) and the 
Cxcl12-expressing stromal cells in the splenic EMH niche. While bone 
marrow CAR cells are LepR+ and Tcf21−, spleen CAR cells are Tcf21+ 
and LepR−. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 5 July; accepted 1 September 2015. 
Published online 16 November 2015. 


1. Abdel-Wahab, O. I. & Levine, R. L. Primary myelofibrosis: update on definition, pathogenesis, and treatment. Annu. Rev. Med. 60, 233-245 (2009).
2. Cheshier, S. H., Prohaska, S. S. & Weissman, I. L. The effect of bleeding on hematopoietic stem cell cycling and self-renewal. Stem Cells Dev. 16, 707-718 (2007).
3. Bennett, M., Pinkerton, P. H., Cudkowicz, G. & Bannerman, R. M. Hemopoietic progenitor cells in marrow and spleen of mice with hereditary iron deficiency anemia. Blood 32, 908-921 (1968).
4. Nakada, D. et al. Oestrogen increases haematopoietic stem-cell self-renewal in females and during pregnancy. Nature 505, 555-558 (2014).
5. Fowler, J. H. & Nash, D. J. Erythropoiesis in the spleen and bone marrow of the pregnant mouse. Dev. Biol. 18, 331-353 (1968).
6. Baldridge, M. T., King, K. Y., Boles, N. C., Weksberg, D. C. & Goodell, M. A. Quiescent haematopoietic stem cells are activated by IFN-γ in response to chronic infection. Nature 465, 793-797 (2010).
7. Burberry, A. et al. Infection mobilizes hematopoietic stem cells through cooperative NOD-like receptor and Toll-like receptor signaling. Cell Host Microbe 15, 779-791 (2014).
8. Morrison, S. J., Wright, D. E. & Weissman, I. L. Cyclophosphamide/granulocyte colony-stimulating factor induces hematopoietic stem cells to proliferate prior to mobilization. Proc. Natl Acad. Sci. USA 94, 1908-1913 (1997).
9. Dutta, P. et al. Myocardial infarction accelerates atherosclerosis. Nature 487, 325-329 (2012).
10. Lowell, C. A., Niwa, M., Soriano, P. & Varmus, H. E. Deficiency of the Hck and Src tyrosine kinases results in extreme levels of extramedullary hematopoiesis. Blood 87, 1780-1792 (1996).
11. Freedman, M. H. & Saunders, E. F. Hematopoiesis in the human spleen. Am. J. Hematol. 11, 271-275 (1981).
12. Tavassoli, M. & Weiss, L. An electron microscopic study of spleen in myelofibrosis with myeloid metaplasia. Blood 42, 267-279 (1973).
13. Johns, J. L. & Christopher, M. M. Extramedullary hematopoiesis: a new look at the underlying stem cell niche, theories of development, and occurrence in animals. Vet. Pathol. 49, 508-523 (2012).
14. Kiel, M. J. et al. SLAM family receptors distinguish hematopoietic stem and progenitor cells and reveal endothelial niches for stem cells. Cell 121, 1109-1121 (2005).
15. Miwa, Y. et al. Up-regulated expression of CXCL12 in human spleens with extramedullary haematopoiesis. Pathology 45, 408-416 (2013).
16. Chow, A. et al. CD169+ macrophages provide a niche promoting erythropoiesis under homeostasis and stress. Nature Med. 19, 429-436 (2013).
17. Morita, Y. et al. Functional characterization of hematopoietic stem cells in the spleen. Exp. Hematol. 39, 351-359 (2011).
18. Ding, L. & Morrison, S. J. Haematopoietic stem cells and early lymphoid progenitors occupy distinct bone marrow niches. Nature 495, 231-235 (2013).
19. Ding, L., Saunders, T. L., Enikolopov, G. & Morrison, S. J. Endothelial and perivascular cells maintain haematopoietic stem cells. Nature 481, 457-462 (2012).
20. Zhou, B. O., Yue, R., Murphy, M. M., Peyer, J. G. & Morrison, S. J. Leptin-receptor-expressing mesenchymal stromal cells represent the main source of bone formed by adult bone marrow. Cell Stem Cell 15, 154-168 (2014).
21. Acharya, A., Baek, S. T., Banfi, S., Eskiocak, B. & Tallquist, M. D. Efficient inducible Cre-mediated recombination in Tcf21 cell lineages in the heart and kidney. Genesis 49, 870-877 (2011).
22. Acar, M. et al. Deep imaging of bone marrow shows non-dividing stem cells are mainly perisinusoidal. Nature 526, 126-130 (2015).
23. Akashi, K., Traver, D., Miyamoto, T. & Weissman, I. L. A clonogenic common myeloid progenitor that gives rise to all myeloid lineages. Nature 404, 193-197 (2000).
24. de Boer, J. et al. Transgenic mice with hematopoietic and lymphoid specific expression of Cre. Eur. J. Immunol. 33, 314-325 (2003).
25. Sugiyama, T., Kohara, H., Noda, M. & Nagasawa, T. Maintenance of the hematopoietic stem cell pool by CXCL12-CXCR4 chemokine signaling in bone marrow stromal cell niches. Immunity 25, 977-988 (2006).


Supplementary Information is available in the online version of the paper. 


Acknowledgements S.J.M. is a Howard Hughes Medical Institute Investigator, 
the Mary McDermott Cook Chair in Pediatric Genetics, the director of the Hamon 
Laboratory for Stem Cells and Cancer, and a Cancer Prevention and Research 
Institute of Texas Scholar. B.O.Z. was supported by a fellowship from the Leukemia 
and Lymphoma Society. We thank N. Loof and the Moody Foundation Flow 
Cytometry Facility, K. Correll and M. Gross for mouse colony management, and 
E. Olson and J. Mendell for providing Cre lines. This work was supported by the 
National Institutes of Health National Heart, Lung, and Blood Institute (HL097760). 


Author Contributions C.N.I. identified the cre alleles used in this study and 
analysed Scf and Cxcl12 conditional knockout mice after Cy+G-CSF treatment. 
B.O.Z. characterized the stromal cells in the spleen and analysed Scf and Cxcl12 
conditional knockout mice after blood loss and pregnancy. M.A. generated and 
characterized the α-catulin^GFP mice. M.M.M. analysed HSC localization in the 
spleen. Z.Z. performed all statistical analyses. J.R. examined spleen histology. 
C.N.I., B.O.Z., M.A., M.M.M. and S.J.M. designed the experiments and interpreted 
the results. C.N.I., B.O.Z. and S.J.M. wrote the manuscript. 


Author Information Microarray data have been deposited in the Gene 
Expression Omnibus under accession number GSE71288. Reprints and 
permissions information is available at www.nature.com/reprints. The authors 
declare no competing financial interests. Readers are welcome to comment 
on the online version of the paper. Correspondence and requests for materials 
should be addressed to S.J.M. (Sean.morrison@utsouthwestern.edu). 




METHODS 

Mice. All mice were maintained on a C57BL/6 background, including Scf^GFP 
(ref. 19), Scf^fl/+ (ref. 19), Cxcl12^DsRed (ref. 18), Cxcl12^fl/+ (ref. 18), R26^tdTomato (ref. 
26), Vav1-cre (ref. 24), Lepr^cre (ref. 27), Tcf21^cre/ER (ref. 21) and α-catulin^GFP. To 
induce Cre/ER activity in Tcf21^cre/ER mice, 4-6-week-old mice were administered 
2 mg tamoxifen (Sigma) daily by oral gavage for 12 consecutive days. For induction 
of EMH, mice were injected at day 0 with a single dose of 4 mg 
cyclophosphamide followed by daily injections of 5 μg G-CSF for 4-21 days. Both male and 
female mice were used. All mice were housed in the Animal Resource Center at the 
University of Texas Southwestern Medical Center (UTSW). All procedures were 
approved by the UTSW Institutional Animal Care and Use Committee. 

Flow cytometric analysis of haematopoietic cells. Bone marrow cells were isolated
by flushing the femur or tibia with Ca2+- and Mg2+-free HBSS with 2%
heat-inactivated bovine serum using a 3-ml syringe fitted with a 25-gauge needle.
Spleen cells were obtained by crushing the spleen between two frosted slides.
The cells were dissociated to a single-cell suspension by gently passing through
the needle several times and then filtering through a 40-µm nylon mesh. Blood
was collected by cardiac puncture, and white blood cells were isolated by ficoll 
centrifugation according to the manufacturer's instructions (GE Healthcare). The 
following antibodies were used to isolate HSCs: anti-CD150 (TC15-12F12.2), 
anti-CD48 (HM48-1), anti-Sca-1 (E13-161.7), anti-c-kit (2B8) and the following 
antibodies against lineage markers (anti-Ter119, anti-B220 (6B2), anti-Gr-1 (8C5), 
anti-CD2 (RM2-5), anti-CD3 (17A2), anti-CD5 (53-7.3) and anti-CD8 (53-6.7)). 
Haematopoietic progenitors were identified by flow cytometry using the following 
antibodies: anti-Sca-1 (E13-161.7), anti-c-Kit (2B8) and the following antibodies 
against lineage markers (anti-Ter119, anti-B220 (6B2), anti-Gr-1 (8C5), anti-CD2 
(RM2-5), anti-CD3 (17A2), anti-CD5 (53-7.3) and anti-CD8 (53-6.7)), anti-CD34 
(RAM34), anti-CD135 (Flt3) (A2F10), anti-CD16/32 (FcγR) (93), anti-CD127
(IL7Rα) (A7R34), anti-CD24 (M1/69), anti-CD43 (1B11), anti-B220 (6B2),
anti-IgM (II/41), anti-CD3 (17A2), anti-Gr-1 (8C5), anti-Mac-1 (M1/70), 
anti-CD41 (MWReg30), anti-CD71 (C2) and anti-Ter119. 4’,6-Diamidino-2- 
phenylindole (DAPI) was used to exclude dead cells. Antibodies were obtained 
from eBioscience or BD Biosciences.

Flow cytometric analysis of stromal cells. To isolate bone marrow stromal cells,
the marrow was gently flushed out of the bone marrow cavity with a 3-ml syringe
fitted with a 23-gauge needle and then transferred into 1 ml pre-warmed bone
marrow digestion solution (200 U ml−1 DNase I (Sigma), 250 µg ml−1 Liberase DL
(Roche) in HBSS plus Ca2+ and Mg2+) and incubated at 37 °C for 30 min with gentle
shaking. To isolate splenic stromal cells, the spleen capsule was cut into ~1 mm3
fragments using scissors and then digested as described above in spleen digestion
solution (200 U ml−1 DNase I, 250 µg ml−1 Liberase DL, 1 mg ml−1 collagenase, type
4 (Roche) and 500 µg ml−1 collagenase D (Roche) in HBSS plus Ca2+ and Mg2+).
After a brief vortex, the spleen fragments were allowed to sediment for ~3 min and
the supernatant was transferred to another tube on ice. The sedimented (undigested)
spleen fragments were subjected to a second round of digestion. The two
fractions of digested cells were pooled and filtered through a 100-µm nylon mesh.
Anti-PDGFR-α (APA5), anti-PDGFR-β (APB5), anti-LepR (R&D), anti-CD45
(30F-11) and anti-Ter119 antibodies were used to isolate stromal cells. For analysis
of endothelial cells, mice were injected intravenously into the retro-orbital venous
sinus with 10 µg Alexa-Fluor-660-conjugated anti-VE-cadherin antibody (BV13)
10 min before being killed. Samples were analysed using a FACSAria or FACSCanto 
II flow cytometer (BD Biosciences). 

BrdU incorporation assay. To assess BrdU incorporation into spleen cells after 
EMH induction, mice were intraperitoneally injected with a single dose of BrdU 
(2 mg BrdU per mouse) then maintained on 0.5 mg BrdU per ml drinking water
for 7 days. Endothelial cells were labelled by intravenous injection of an anti-VE- 
cadherin antibody (eBioscience). Enzymatically dissociated spleen cells were 
stained with antibodies against surface markers and the target cell populations 
were sorted then resorted to ensure purity. The sorted cells were then fixed, and 
stained with an anti-BrdU antibody using the BrdU APC Flow Kit (BD Biosciences) 
according to the manufacturer’s instructions. 

Long-term competitive reconstitution assay. Adult recipient mice were irradi- 
ated using an XRAD 320 X-ray irradiator (Precision X-Ray) with two doses of 
540 rad (total 1,080 rad) delivered at least 2 h apart. Cells were injected into the
retro-orbital venous sinus of anaesthetized mice. Sorted doses of splenocytes from
donor mice with EMH were transplanted along with 3 × 10^5 recipient bone marrow
cells. Recipient mice were bled every 4 weeks to assess the level of donor-derived
blood cells, including myeloid, B and T cells, for at least 16 weeks. Blood was
subjected to ammonium chloride/potassium red cell lysis before antibody staining.
Antibodies including anti-CD45.2 (104), anti-CD45.1 (A20), anti-Gr-1 (8C5),
anti-Mac-1 (M1/70), anti-B220 (6B2) and anti-CD3 (KT31.1) were used for flow 
cytometric analysis. 


Tissue sectioning and confocal imaging. For bone marrow sections, freshly dissected
bones were fixed in 4% paraformaldehyde overnight followed by 3 days of
decalcification in 10% EDTA dissolved in PBS. Bones were sectioned using the
CryoJane tape-transfer system (Instrumedics). For spleen sections, freshly dissected
spleens were fixed in 4% paraformaldehyde for 1 h followed by 1 day of incubation
in 10% sucrose in PBS. Frozen spleens were sectioned with a cryostat (Leica).
For whole-mount imaging, spleens were sectioned into ~2 mm pieces. Spleen sections
were blocked in PBS with 10% horse serum for 1 h and then stained overnight
with chicken-anti-GFP (Aves) and/or rabbit-anti-laminin (Abcam) antibodies.
Donkey-anti-chicken Alexa Fluor 488 and/or donkey-anti-rabbit Alexa Fluor 647
were used as secondary antibodies (Invitrogen). Specimens were mounted with
ProLong Gold antifade reagent (Invitrogen) and images were acquired with either a
Zeiss LSM780 confocal microscope or a Leica SP8 confocal microscope equipped with a
resonant scanner. Three-dimensional images were generated using Bitplane Imaris
v.7.7.1 software.

Deep imaging of spleens. Spleens were harvested and fixed for 4 h in 4% PFA at
4 °C. Since the spleen capsule is highly autofluorescent, spleens were sectioned
perpendicular to the long axis into 300-µm-thick sections using a Leica VT1000 S
vibratome. These 300-µm sections were fixed for an additional 2 h in 4% PFA and
blocked overnight in staining solution (10% dimethylsulfoxide (DMSO), 0.5%
IgePal630 (Sigma) and 5% donkey serum (Jackson Immunoresearch) in PBS).
All staining steps were performed in staining solution on a rotator at room
temperature. Spleen sections were stained for 3 days in primary antibodies, washed
overnight in several changes of PBS then stained for 3 days in secondary
antibodies. The stained sections were dehydrated in a methanol dehydration series
then incubated for 3 h in 100% methanol with several changes. The methanol was
then exchanged with benzyl alcohol:benzyl benzoate 1:2 mix (BABB clearing; ref. 28).
The tissues were incubated in BABB for 3 h to overnight with several exchanges
of fresh BABB. Spleen sections were mounted in BABB between two coverslips
and sealed with silicone (Premium waterproof silicone II clear; General Electric).
We found it necessary to clean the BABB of peroxides (which can accumulate
as a result of exposure to air and light) by adding 10 g of activated aluminium
oxide (Sigma) to 40 ml of BABB and rotating for at least 1 h, then centrifuging at
2,000 g for 10 min to remove the suspended aluminium oxide particles. Images were
acquired using a Zeiss LSM780 confocal microscope with a Zeiss LD LCI Plan-Apo
×25/0.8 multi-immersion objective lens, which has a 570 µm working distance.
Images were taken at 512 × 512 pixel resolution with 2 µm Z-steps, pinhole for the
internal detector at 47.7 µm. Random spots were inserted into images by
generating randomized X, Y, and Z coordinates using the random integer generator at
http://www.random.org.
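
The random-spot control described above can be illustrated with a short sketch. It is not the authors' procedure: they drew integers from random.org, whereas this example substitutes Python's built-in pseudo-random generator. The function name, the seed and the 512 × 512 × 150 voxel stack (a 300-µm section sampled at 2-µm Z-steps) are assumptions made for illustration only.

```python
import random


def random_spots(n_spots, x_size, y_size, z_size, seed=None):
    """Return n_spots (x, y, z) integer voxel coordinates, each drawn
    uniformly and independently within a stack of the given dimensions."""
    rng = random.Random(seed)  # local generator so a seed gives repeatable spots
    return [
        (rng.randrange(x_size), rng.randrange(y_size), rng.randrange(z_size))
        for _ in range(n_spots)
    ]


# Hypothetical stack: 512 x 512 pixels and 150 optical slices (assumed values).
spots = random_spots(n_spots=100, x_size=512, y_size=512, z_size=150, seed=1)
```

Random spots of this kind provide a null distribution: distances from real HSCs to a landmark can be compared with distances from randomly placed spots to the same landmark.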

Splenectomy. After mouse anaesthesia by ketamine/xylazine, a ventral midline 
incision was made and the peritoneum was breached. The splenic blood vessels 
were ligated with an absorbable suture (4-0 vicryl). The splenic vessels were cut 
distal to the suture and the spleen was removed. The vessels were cauterized 
and the abdomen was sutured with non-absorbable sutures (3-0 Tevdek III). 
Buprenorphine was administered every 12h for 3 days to minimize postoperative 
pain and mice were maintained with ampicillin-containing water to avoid infection.
Complete blood counts were measured one month after the survival surgery.

Induction of EMH by bleeding. EMH was induced by repeated bleeding over a
2-week period according to a published protocol’. Briefly, 4-6 month-old mice 
were bled via the tail vein five times, every 3 days, removing approximately 250 µl
of blood each time, then the mice were killed for analysis 2 days after the last bleed.

Western blot. Approximately 30,000 CD45−Ter119−VE-cadherin+ splenic
endothelial cells were flow cytometrically sorted into 50 µl of 66% trichloroacetic acid
(TCA) in water. Extracts were incubated on ice for at least 15 min and centrifuged at 
16,100 g at 4°C for 10 min. Precipitates were washed in acetone twice and the dried 
pellets were solubilized in 9 M urea, 2% Triton X-100, and 1% dithiothreitol (DTT). 
Samples were separated on 4-12% Bis-Tris polyacrylamide gels (Invitrogen) 
and transferred to PVDF membrane (Millipore). The blots were incubated 
with primary antibodies overnight at 4°C and then with secondary antibodies. 
Blots were developed with the SuperSignal West Femto chemiluminescence kit
(Thermo Scientific). Primary antibodies used: rabbit-anti-SCF (Abcam, 1:1,000) 
and mouse-anti-actin (Santa Cruz, clone AC-15, 1:20,000). 

Quantitative real-time PCR. Cells were sorted directly into Trizol (Life 
Technologies). Total RNA was extracted according to the manufacturer’s 
instructions. Total RNA was reverse transcribed using SuperScript III Reverse 
Transcriptase (Life Technologies). Quantitative real-time PCR was performed 
using SYBR green on a LightCycler 480 (Roche). β-Actin was used to normalize
the RNA content of samples. Primers used in this study were Scf:
5’-GCCAGAAACTAGATCCTTTACTCCTGA-3’ and
5’-CATAAATGGTTTTGTGACACTGACTCTG-3’; β-actin:
5’-GCTCTTTTCCAGCCTTCCTT-3’ and 5’-CTTCTGCATCCTGTCAGCAA-3’.
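
The text states that transcript levels were normalized to β-actin but does not give the formula. A common convention is the comparative Ct calculation, sketched below under that assumption. The function names are hypothetical, the Ct values in the usage note are invented for illustration, and the sketch assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene, 2^-(delta Ct).

    delta Ct = Ct(target) - Ct(reference); a lower Ct means more template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def fold_change(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a sample versus a control sample (2^-(delta delta Ct))."""
    sample = relative_expression(ct_target, ct_ref)
    control = relative_expression(ct_target_ctrl, ct_ref_ctrl)
    return sample / control
```

For example, a sorted population with Scf at Ct 24 and β-actin at Ct 20, compared with whole spleen cells at Ct 26 and Ct 20, would give `fold_change(24, 20, 26, 20)`, a fourfold enrichment.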




Gene expression profiling. Three independent samples of 5,000 spleen
Scf-GFP+VE-cadherin− stromal cells and two independent samples of 5,000
unfractionated spleen cells were flow cytometrically sorted into Trizol. Total RNA 
was extracted, amplified, and sense strand cDNA was generated using the Ovation 
Pico WTA System V2 (NuGEN) according to the manufacturer’s instructions. 
cDNA was fragmented and biotinylated using the Encore Biotin Module (NuGEN) 
according to the manufacturer's instructions. Labelled cDNA was hybridized to 
Affymetrix Mouse Gene ST 1.0 chips according to the manufacturer's instructions. 
Expression values for all probes were normalized and determined using the robust 
multi-array average (RMA) method (ref. 29).
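
RMA combines background correction, quantile normalization across arrays, a log2 transform and median-polish summarization of probe sets. The sketch below illustrates only the quantile-normalization and log2 steps on toy data, as a reading aid. It is not the pipeline used for the deposited data, it ignores tied values, and the function names are hypothetical. Real analyses would use an established implementation.

```python
import math


def quantile_normalize(arrays):
    """Force every array (list of probe intensities) onto the same
    distribution: the mean of the sorted intensities across arrays."""
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Mean intensity at each rank, averaged over all arrays.
    rank_means = [
        sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = rank_means[rank]
        normalized.append(out)
    return normalized


def log2_transform(arrays):
    """Convert intensities to the log2 scale used for expression values."""
    return [[math.log2(v) for v in a] for a in arrays]
```

After normalization each array contains the same set of values, assigned to probes according to their original rank within that array.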

Statistical methods. Panels in all figures represent multiple independent experiments
performed on different days with different mice. Sample sizes were not based
on power calculations. No randomization or blinding was performed. No animals 
were excluded from analysis. Variation is always indicated using standard devia- 
tion. For analysis of the statistical significance of differences between two groups 
we generally performed two-tailed Student's t-tests. For analysis of the statistical 
significance of differences among more than two groups, we performed repeated 
measures one-way analysis of variance (ANOVA) tests with Greenhouse-Geisser 
correction (variances between groups were not equal) and Tukey’s multiple com- 
parison tests with individual variances computed for each comparison. To assess 
the statistical significance of differences in fetal mass between paired control and 
mutant mice (Fig. 5j and Extended Data Fig. 8v), we performed a two-way ANOVA. 
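
As a reading aid for the two-group comparisons described above, the following sketch computes the group mean, the standard deviation and the pooled-variance (Student's) t statistic with its degrees of freedom. The function name and sample values are hypothetical. Converting the statistic into a two-tailed P value requires the t distribution, which the standard library does not provide; SciPy's `scipy.stats.ttest_ind` is one option if that package is available.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for Student's two-sample t-test,
    which assumes equal variances in the two groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Invented values for illustration only.
control = [1.0, 2.0, 3.0]
mutant = [4.0, 5.0, 6.0]
t_stat, dof = student_t(control, mutant)
control_sd = statistics.stdev(control)  # the s.d. reported as "mean ± s.d."
```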


26. Madisen, L. et al. A robust and high-throughput Cre reporting and 
characterization system for the whole mouse brain. Nature Neurosci. 13, 
133-140 (2010). 

27. DeFalco, J. et al. Virus-assisted mapping of neural inputs to a feeding center in 
the hypothalamus. Science 291, 2608-2613 (2001). 


28. Becker, K., Jährling, N., Saghafi, S. & Dodt, H. U. Immunostaining, dehydration,
and clearing of mouse embryos for ultramicroscopy. Cold Spring Harb. Protoc.
2013, 743-744 (2013).

29. Irizarry, R. A. et al. Exploration, normalization, and summaries of high
density oligonucleotide array probe level data. Biostatistics 4, 249-264
(2003).

30. Cesta, M. F. Normal structure, function, and histology of the spleen. Toxicol.
Pathol. 34, 455-465 (2006).

31. Tronche, F. et al. Disruption of the glucocorticoid receptor gene in the nervous
system results in reduced anxiety. Nature Genet. 23, 99-103 (1999).

32. Zhu, X., Bergles, D. E. & Nishiyama, A. NG2 cells generate both
oligodendrocytes and gray matter astrocytes. Development 135, 145-157
(2008).

33. Zhu, X. et al. Age-dependent fate and lineage restriction of single NG2 cells.
Development 138, 745-753 (2011).

34. Logan, M. et al. Expression of Cre recombinase in the developing mouse limb
bud driven by a Prx1 enhancer. Genesis 33, 77-80 (2002).

35. Rivers, L. E. et al. PDGFRA/NG2 glia generate myelinating oligodendrocytes
and piriform projection neurons in adult mice. Nature Neurosci. 11,
1392-1401 (2008).

36. Cuttler, A. S. et al. Characterization of Pdgfrb-Cre transgenic mice reveals
reduction of ROSA26 reporter activity in remodeling arteries. Genesis 49,
673-680 (2011).

37. Holtwick, R. et al. Smooth muscle-selective deletion of guanylyl cyclase-A
prevents the acute but not chronic effects of ANP on blood pressure. Proc. Natl
Acad. Sci. USA 99, 7142-7147 (2002).

38. Xin, H. B., Deng, K. Y., Rishniw, M., Ji, G. & Kotlikoff, M. I. Smooth muscle
expression of Cre recombinase and eGFP in transgenic mice. Physiol. Genomics
10, 211-215 (2002).

39. Wendling, O., Bornert, J. M., Chambon, P. & Metzger, D. Efficient temporally-
controlled targeted mutagenesis in smooth muscle cells of the adult mouse.
Genesis 47, 14-18 (2009).

[Extended Data Figure 1 image: panels a-y, comprising spleen micrographs
(Laminin/Ter119/CD3; GFP/DsRed/Laminin), flow cytometry plots and bar graphs of
cellularity, HSCs, progenitors, colony frequency and lineage cells.]

Extended Data Figure 1 | Cy+21 d G-CSF treatment induces EMH in
the spleen and deletion of Scf from LepR+ cells significantly reduces
the number of HSCs in the bone marrow and the spleen after induction
of EMH. a, b, Staining with anti-laminin antibody distinguished the
vasculature of red pulp (RP) from white pulp (WP). The red pulp and
white pulp were marked by clusters of Ter119+ cells (red) and CD3+ cells
(blue), respectively (ref. 30). Dashed line depicts the boundary between red
pulp and white pulp (representative images from 3 mice in 3 independent
experiments). c, Spleen sections of the same magnification show
the enlargement of the spleen after induction of EMH by Cy+21 d
G-CSF. These are the same images as in Fig. 1a, d, adjusted to reflect the
same magnification. d, e, Imaging of thick spleen sections from Scf^GFP;
Cxcl12^DsRed mice after the induction of EMH by Cy+21 d G-CSF.
e, High-magnification view of the boxed area in d. Dashed lines depict
the boundaries between white pulp and red pulp. Arrow indicates the
central arteriole in the white pulp around which stromal cells expressed
Cxcl12-DsRed (representative images from 3 mice from 3 independent
experiments). f, g, Haematoxylin and eosin (H&E) staining showing the
increase in haematopoiesis in the spleen after induction of EMH using
Cy+G-CSF (+EMH, g) as evidenced by the presence of megakaryocytes
(arrows; n=3 mice per condition from 3 independent experiments).
h-n, Cy+G-CSF treatment significantly increased spleen cellularity
(h), as well as the numbers of HSCs (i), MEPs (j), frequencies of colony-
forming progenitors (k), numbers of Ter119+ erythroid cells (l) and
Gr-1+Mac-1+ myeloid cells (m) in the spleen but not the number of
B220+ or CD3+ lymphoid cells (n). The numbers of mice per treatment
are shown in each bar of each panel. Each panel shows mean ± s.d. from
five independent experiments. o, p, Scf-GFP (o) and Cxcl12-DsRed (p)
fluorescence by spleen stromal cells before (−EMH) and after induction
of EMH (+EMH) using Cy+G-CSF. q, r, The frequencies (q) and absolute
numbers (r) of Scf-GFP+VE-cadherin+ endothelial cells and
Scf-GFP+VE-cadherin− stromal cells significantly increased upon induction of EMH
by Cy+21 d G-CSF (+EMH). s, Spleens from Lepr^cre; R26^tdTomato; Scf^GFP
mice showed Tomato expression was primarily in the stromal cells of the
white pulp. Although most Scf-GFP expression was in endothelial cells
and perivascular stromal cells of the red pulp (Fig. 1a-d), some Scf-GFP+
stromal cells were in the white pulp, most of which appeared to express
LepR. Dashed line depicts the boundary between red pulp and white pulp
(representative images of 6 mice from 4 independent experiments).
t, Flow cytometric analysis of enzymatically dissociated spleen cells
from Lepr^cre; R26^tdTomato; Scf^GFP mice showed that only a small minority
of non-endothelial Scf-GFP+ cells were positive for Tomato (n=3 mice
from 3 independent experiments). u, Tomato+CD45−Ter119− stromal
cells in the spleens of Lepr^cre; R26^tdTomato mice expressed PDGFR-α,
PDGFR-β, Sca-1 and LepR (n=3 mice from 3 independent experiments).
v, Percentage of all CFU-F colonies formed by enzymatically dissociated
spleen cells from Lepr^cre; R26^tdTomato mice that expressed Tomato.
Macrophage colonies were excluded by staining with anti-CD45 antibody
(n=4 mice from 3 independent experiments). w, Lepr^cre; Scf^fl/− mice had
significantly fewer HSCs in the bone marrow than wild-type and Scf^fl/−
controls before induction of EMH (n=4 mice per genotype per time
point from 4 independent experiments). NT, not treated.
x, y, Lepr^cre; Scf^fl/− mice displayed significantly lower spleen cellularity
(x) and HSC number (y) in the spleen than wild-type and Scf^fl/− controls
after induction of EMH with cyclophosphamide plus 4 days of G-CSF.
The numbers of mice per treatment are shown in each bar. Data represent
mean ± s.d. from 4 independent experiments. h-n, q, r, The statistical
significance of differences was assessed using two-tailed Student's t-tests
(***P < 0.001). w-y, The statistical significance of differences between
genotypes was assessed using repeated measures one-way ANOVAs with
Greenhouse-Geisser correction and Tukey’s multiple comparison tests
with individual variances computed for each comparison. *P < 0.05,
**P < 0.01, statistical significance relative to wild-type (Scf^+/+). †P < 0.05,
††P < 0.01, statistical significance between Scf^fl/− and Lepr^cre; Scf^fl/−.


[Extended Data Figure 2 image: panels a-i, comprising liver, spleen and femur
micrographs (Tomato/Scf-GFP/VE-cadherin) and flow cytometry plots of bone marrow
and liver after Cy+G-CSF.]

Extended Data Figure 2 | Scf is expressed by most endothelial cells
but not by Tcf21+ perivascular cells in the liver; Cy+21 d G-CSF does
not significantly change the recombination pattern of Tcf21-Cre/ER
in the spleen, bone marrow or liver. a-i, To identify Cre alleles that
recombine in spleen, but not bone marrow, stromal cells we assessed the
gene expression profile of spleen Scf-GFP+VE-cadherin− stromal cells
(Extended Data Table 1). Nestin, NG2 (also known as Cspg4) and Prx1
were low or undetectable (data not shown). Nestin-Cre (ref. 31), NG2-Cre
(ref. 32), NG2-Cre/ER (ref. 33) and Prx1-Cre (ref. 34) did not recombine
widely or specifically in Scf-GFP+ stromal cells in the spleen (data not
shown). Pdgfra and Pdgfrb were expressed by spleen Scf-GFP+ stromal cells
but neither Pdgfra-Cre/ER (ref. 35) nor Pdgfrb-Cre (ref. 36) recombined
efficiently (data not shown).
Sm22 (also known as Tagln), Myh11, Sma (also known as Acta2) and Tcf21
were significantly more highly expressed by spleen than bone marrow
Scf-GFP+ stromal cells (Extended Data Table 1). Sm22-Cre (ref. 37),
Myh11-Cre (ref. 38) and Sma-Cre/ER (ref. 39) recombined in few spleen
Scf-GFP+ stromal cells (data not shown). However, Tcf21-Cre/ER
recombined in perivascular stromal cells in the spleen but not bone
marrow (Fig. 2). a-c, Under normal conditions, Scf-GFP was expressed by
most VE-cadherin+ endothelial cells (arrowheads in a) but not by Tcf21+
stromal cells (arrows in a) in the liver (n=3 mice from 3 independent
experiments). d, e, EMH induced by Cy+21 d G-CSF did not alter the
general distribution (d) or perivascular localization (e) of Tomato+ cells
in the spleens of Tcf21^cre/ER; R26^tdTomato mice as compared to normal mice
(Fig. 2a, d). f, g, Tomato expression was undetectable in the bone marrow
of Tcf21^cre/ER; R26^tdTomato mice after Cy+G-CSF treatment irrespective of
whether the bone marrow was analysed by whole-mount imaging (f) or
flow cytometry (g). h, i, EMH induced by Cy+G-CSF did not significantly
change the frequency (h) or perivascular localization (i, arrows) of
Tomato+ cells in the livers of Tcf21^cre/ER; R26^tdTomato mice. d-i, n=3 mice
from 3 independent experiments.


[Extended Data Figure 3 image: panels a-u, comprising spleen micrographs
(Tomato/c-Kit/Laminin), sections before and after clearing, and graphs of WBC,
RBC and PLT counts and of bone marrow and spleen progenitor numbers.]

Extended Data Figure 3 | Deep imaging of HSCs in the spleen; deletion
of Scf or Cxcl12 from Tcf21-expressing stromal cells in the spleen
reduced peripheral blood cell counts but did not affect bone marrow
haematopoiesis. a, b, The vast majority of c-Kit+ haematopoietic
progenitors localized adjacent to Tcf21-expressing stromal cells in the red
pulp of the normal spleen (n=3 mice from 3 independent experiments).
c, d, Three-hundred-micrometre-thick sections of spleen before (c) and
after optical clearing (d). e, f, Deep imaging of α-catulin-GFP+c-Kit+
HSCs in cleared spleen segments from Tcf21^cre/ER; R26^tdTomato; α-catulin^GFP
mice. A representative high-magnification image of an α-catulin-GFP+c-Kit+
HSC surrounded by Tomato+ stromal cells (e). f, Low-magnification
view of a digitally reconstructed 300-µm-thick spleen fragment with
α-catulin-GFP+c-Kit+ HSCs identified by large yellow spheres. Note
that actual HSCs would be smaller than the yellow spheres but would
not be visible at this magnification (n=3 mice from 3 independent
experiments). g-m, Tcf21^cre/ER; Scf^fl/fl and Scf^fl/fl control mice were treated
with tamoxifen then examined 1 month later without further treatment
(not treated (NT)) or after treatment with cyclophosphamide plus 4, 8,
or 21 days of G-CSF to induce EMH. Data show WBC (g), RBC (h) and
PLT counts (i), numbers of CMPs (j), GMPs (k) and MEPs (l) in the bone
marrow and numbers of GMPs in the spleen (m). n, Platelet counts of
sham-operated and splenectomized mice that were treated with Cy+21
d G-CSF 1 month after surgery. o-u, Tcf21^cre/ER; Cxcl12^fl/− mice and
littermate controls (Cxcl12^fl/− or Cxcl12^+/−) were treated with tamoxifen
then examined 1 month later without further treatment (NT) or after
treatment with cyclophosphamide plus 4, 8, or 21 days of G-CSF to induce
EMH. Data show WBC (o), RBC (p) and PLT counts (q), numbers of
CMPs (r), GMPs (s) and MEPs (t) in the bone marrow and numbers of
GMPs in the spleen (u). The numbers of mice per treatment are shown in
each panel. All data reflect mean ± s.d. from 3 independent experiments.
Two-tailed Student's t-tests were used to assess statistical significance
(*P<0.05, ***P<0.001).


[Extended Data Figure 4 image: panels a-u, comprising bar graphs of endothelial
cells (%), perivascular cells (%) and Scf transcripts in normal and Cy/G-CSF-treated
mice, and spleen micrographs.]

Extended Data Figure 4 | Conditional deletion of Scf or Cxcl12 with
Tcf21-Cre/ER, or Scf with Vav1-Cre, does not significantly affect the
frequency or morphology of stromal cells in the spleen, irrespective
of EMH induction. a-g, Irrespective of whether the mice were treated
with Cy+21 d G-CSF, conditional deletion of Scf from Tcf21+ cells did
not significantly change the frequency of VE-cadherin+ endothelial cells
(a) or PDGFR-β+ perivascular stromal cells (b), Scf transcript levels in
endothelial cells (c), or the morphology or density of blood vessels in
the spleen (d-g). h-n, Irrespective of whether the mice were treated
with Cy+G-CSF, conditional deletion of Cxcl12 from Tcf21+ cells did
not significantly change the frequency of VE-cadherin+ endothelial cells
(h) or PDGFR-β+ perivascular stromal cells (i), Scf transcript levels in
endothelial cells or perivascular stromal cells (j), or the morphology or
density of blood vessels in the spleen (k-n). o-u, Irrespective of whether
the mice were treated with Cy+G-CSF, conditional deletion of Scf using
Vav1-Cre did not significantly change the frequency of VE-cadherin+
endothelial cells (o) or PDGFR-β+ perivascular stromal cells (p), Scf
transcript levels in perivascular stromal cells (q), or the morphology
or density of blood vessels in the spleen (r-u). Scf transcript levels in
flow cytometrically isolated cells were normalized to β-actin and then
compared to whole spleen cells (c, j and q). The data reflect mean ± s.d.
from 3 mice per genotype per condition in 3 independent experiments.
Two-tailed Student's t-tests were used to assess statistical significance.


[Extended Data Figure 5 image: panels a-k, comprising flow cytometry plots and
micrographs (Tomato/GFP/VE-cadherin) of spleen, bone marrow and liver from
Vav1-cre reporter mice, Scf transcript levels and an anti-SCF/anti-actin western blot.]


Extended Data Figure 5 | Vav1-Cre recombines efficiently and
specifically in spleen endothelial cells but poorly in bone marrow or
liver endothelial cells. a, Tomato^high CD45−Ter119− cells in Vav1-cre;
tdTomato mice were uniformly positive for VE-cadherin and negative
for PDGFR-β (n=3 mice from 3 independent experiments). b, Vav1-
Cre recombined in most spleen endothelial cells but in few bone marrow
endothelial cells, irrespective of Cy+G-CSF treatment (+EMH). c, Scf
transcript levels were significantly reduced in endothelial cells from
the spleen but not from bone marrow or liver in Vav1-cre; Scf^fl/− mice
as compared to Scf^fl/− mice. The Scf transcript level was normalized
to β-actin. d, Western blot showed lower SCF protein levels in splenic
endothelial cells from Vav1-cre; Scf^fl/− mice as compared to Scf^fl/− mice.
SCF abundance was assessed relative to actin by ImageJ software (n=3
mice per genotype from 3 independent experiments). e-h, In the bone
marrow Vav1-Cre recombined in a minority of endothelial cells, including
some sinusoidal (arrows in h) and some arteriolar (arrowheads in h)
endothelial cells, that expressed little Scf-GFP by flow cytometry (f, g).
The data reflect mean ± s.d. from 3 mice per genotype in 3 independent
experiments. i-k, Vav1-Cre recombined inefficiently in liver endothelial
cells. Most Tomato+ cells in the liver of Vav1-cre; R26^tdTomato; Scf^GFP mice
were VE-cadherin+ and Scf-GFP+ (i; arrows in k) but these cells accounted
for only 26 ± 4.2% of Scf-GFP+ cells by flow cytometry (i, j) and confocal
microscopy (k, n=3 mice from 3 independent experiments). Two-tailed
Student's t-tests were used to assess statistical significance.


[Extended Data Figure 6 image: panels a-z, comprising micrographs of spleen, femur
and liver (Tomato/Laminin; Tomato/VE-cadherin) after Cy+G-CSF, a liver flow
cytometry plot, and graphs of WBC, RBC and PLT counts and of spleen (SP) and bone
marrow (BM) cellularity, HSCs, CMPs, GMPs and MEPs.]

Extended Data Figure 6 | EMH induced by Cy+G-CSF does not
significantly change the recombination pattern of Vav1-Cre in the
spleen, bone marrow, or liver but deletion of Scf from endothelial cells
in spleens with EMH reduces blood cell counts without affecting bone
marrow haematopoiesis. a, b, After EMH induced by Cy+21 d G-CSF,
Vav1-Cre-recombined cells were predominantly in the red pulp (a) and
co-localized with VE-cadherin+ cells (b) in the spleen. c–f, After EMH
induced by Cy+21 d G-CSF, Vav1-Cre-recombined cells remained rare
in the bone marrow (c, d) and liver (e, f; n = 3 mice from 3 independent
experiments). g–s, Vav1-cre; Cxcl12^fl/− mice and Cxcl12^fl/− controls were
treated with Cy+4–21 d G-CSF to induce EMH. Data show WBC (g),
RBC (h), and platelet (i) counts, spleen cellularity (j) and numbers of
HSCs (k), CMPs (l), GMPs (m) and MEPs (n) in the spleen as well as bone
marrow cellularity (o), and numbers of HSCs (p), CMPs (q), GMPs (r)
and MEPs (s) in one femur and one tibia. The data represent
mean ± s.d. from 3 (Cy+4 d G-CSF treatment) and 5 (Cy+21 d G-CSF
treatment) independent experiments. The number of mice per treatment
is indicated on each bar. Two-tailed Student’s t-tests were used to assess
statistical significance. t–z, Vav1-cre; Scf^fl/− mice and Scf^fl/+ and Scf^fl/−
controls were treated with Cy+4–21 d G-CSF to induce EMH. Data show
WBC (t), RBC (u), and platelet (PLT) (v) counts, numbers of CMPs (w),
GMPs (x) and MEPs (y) in the bone marrow as well as numbers of GMPs
in the spleen (z). Note that after 21 days of G-CSF both Scf^fl/− and
Vav1-cre; Scf^fl/− mice showed significantly lower CMP numbers relative
to Scf^fl/+ mice but their CMP numbers were not significantly different
from each other (w), indicating that CMP numbers in the bone marrow
were not influenced by Scf deletion from spleen endothelial cells. The
data represent mean ± s.d. from 3 (no treatment (NT)), 3 (4 days),
3 (8 days), and 8 (21 days) independent experiments. The number of
mice per treatment is indicated on each bar. The statistical significance
of differences among genotypes was assessed using repeated measures
one-way ANOVAs with Greenhouse-Geisser correction and Tukey’s
multiple comparison tests with individual variances computed for each
comparison. *P < 0.05, **P < 0.01, statistical significance relative to Scf^fl/+
controls. †P < 0.05, ††P < 0.01, statistical significance between Scf^fl/−
and Vav1-cre; Scf^fl/−.


© 2015 Macmillan Publishers Limited. All rights reserved 


[Figure panels omitted. Recoverable labels: normal versus pregnant mice; cellularity, HSCs, colony frequency (%) (BFU-E, Mk, GEMM, GM), myeloid and erythroid progenitors (CMP, MEP), erythroid, myeloid and lymphoid (B, T) cell numbers; Scf^GFP; Cxcl12^DsRed maternal spleen (VE-cadherin, Cxcl12-DsRed, Scf-GFP); Tcf21^Cre/ER; R26^tdTomato pregnant spleen, femur, femoral BM and liver (Tomato/laminin, Tomato/VE-cadherin).]

Extended Data Figure 7 | Pregnancy induces EMH and the proliferation
of endothelial cells and stromal cells in the spleen without significantly
changing the recombination pattern of Tcf21-Cre/ER in the spleen,
bone marrow or liver. a–v, Pregnant female mice were at gestation day
18.5. a, b, H&E staining showed increased haematopoiesis in the spleens
of pregnant mice (b) as evidenced by the presence of megakaryocytes
(arrows; n = 3 mice per condition from 3 independent experiments). c–i,
Pregnancy significantly increased spleen cellularity (c), as well as the
numbers of HSCs (d), MEPs (e, f), Ter119+ erythroid cells (g) and
Gr-1+Mac-1+ myeloid cells (h) in the spleen but not the number of B220+
or CD3+ lymphoid cells (i). j, k, During pregnancy, Scf-GFP was expressed
by VE-cadherin+ endothelial cells and VE-cadherin− stromal cells (j)
while Cxcl12-DsRed was expressed by a subset of the VE-cadherin− Scf-GFP+
stromal cells (j, k). l, Whole-mount imaging of a thick spleen section
from a pregnant Scf^GFP; Cxcl12^DsRed mouse (representative images from
3 mice in 3 independent experiments). m, n, In the spleen, the numbers
of Scf-GFP+ cells (m) and Cxcl12-DsRed+ cells (n) significantly increased
upon bleeding. o, Endothelial and stromal cells in the spleen proliferated
after bleeding. BrdU was administered to Scf^GFP mice or Cxcl12^DsRed mice
for 18 days, beginning in pregnant mice after the plug was observed.
The number of mice per treatment is indicated on each bar. Each
panel shows mean ± s.d. from 3 independent experiments. Two-tailed
Student's t-tests were used to assess statistical significance (**P < 0.01,
***P < 0.001). p–r, Pregnancy did not alter the general distribution (p),
perivascular localization (q) or surface marker expression (r; PDGFR-β+
and LepR−) of Tomato+ cells in the spleens of Tcf21^Cre/ER; R26^tdTomato mice.
s, t, Tomato expression remained undetectable in the bone marrow of
pregnant Tcf21^Cre/ER; R26^tdTomato mice. u, v, During pregnancy, Tcf21-Cre/ER
recombined in rare perivascular cells in the liver. p–v, n = 3 mice per
genotype from 3 independent experiments.


[Figure panels omitted. Recoverable labels: bar charts for normal and pregnant mice (groups 1–3) of bone marrow (BM) and spleen (SP) cellularity, HSCs, GMPs, CMPs, MEPs and lineages (erythroid, myeloid, B, T), plus WBC (K/µl) and PLT (M/µl) counts.]

Extended Data Figure 8 | Conditional deletion of Cxcl12 from Tcf21+
stromal cells impairs EMH in the spleens of pregnant mice without
significantly affecting bone marrow haematopoiesis. a–x, Four-to-six-month-old
female mice that had been treated with tamoxifen at least
2 months before were mated with normal wild-type males. Normal females
and pregnant females at gestation day 18.5 were analysed. a–d, Conditional
deletion of Scf from Tcf21+ cells did not significantly affect the numbers
of GMPs (a), CMPs (b), MEPs (c), Ter119+ (erythroid), Gr-1+Mac-1+
(myeloid), CD3+ (T) and B220+ (B) cells (d) in one femur or one tibia.
e, f, Conditional deletion of Scf from Tcf21+ cells significantly reduced
GMPs (e), Ter119+ erythrocytes and Gr-1+Mac-1+ myeloid cells (f) in
the spleen. g, h, Conditional deletion of Scf from Tcf21+ cells did not
significantly affect WBC (g) or platelet counts (h). i–n, Conditional
deletion of Cxcl12 from Tcf21+ cells did not significantly affect bone
marrow cellularity (i), or the numbers of HSCs (j), GMPs (k), CMPs (l),
MEPs (m), Ter119+ (erythroid), Gr-1+Mac-1+ (myeloid), CD3+ (T)
and B220+ (B) cells (n) in the bone marrow. o–w, Spleen cellularity
(o) and numbers of HSCs (p), GMPs (q), CMPs (r), MEPs (s), Ter119+
(erythroid), Gr-1+Mac-1+ (myeloid), CD3+ (T) and B220+ (B) cells (t)
in the spleen and WBC (u), platelet (v) and RBC counts (w) in the blood.
x, Conditional deletion of Cxcl12 from Tcf21+ cells in the spleens of
pregnant mothers did not significantly affect fetal mass. The numbers of
mice per treatment are shown in each bar within each panel. Each panel
shows mean ± s.d. from 3 independent experiments. a–w, The statistical
significance of differences among genotypes was assessed using a repeated
measures one-way ANOVA with Greenhouse-Geisser correction along
with Tukey’s multiple comparison tests with individual variances. x,
The statistical significance of differences was assessed using a two-way
ANOVA. *P < 0.05, **P < 0.01, ***P < 0.001, statistical significance
relative to normal mice. †P < 0.05, ††P < 0.01, †††P < 0.001, statistical
significance between Scf mutant mice and control mice after bleeding.


[Figure panels omitted. Recoverable labels: normal versus bled mice; cellularity, HSCs, colony frequency (%) (BFU-E, Mk, GEMM, GM), myeloid and erythroid progenitors (CMP, GMP, MEP), erythroid, myeloid and lymphoid (B, T) cell numbers; Scf^GFP; Cxcl12^DsRed spleen after bleeding (VE-cadherin, Cxcl12-DsRed, Scf-GFP); Tcf21^Cre/ER; R26^tdTomato after bleeding: spleen (PDGFRβ, LepR), femur and femoral BM (Tomato).]

Extended Data Figure 9 | Bleeding induces EMH and the proliferation
of endothelial cells and stromal cells in the spleen without significantly
changing the recombination pattern of Tcf21-Cre/ER in the spleen,
bone marrow, or liver. a, b, H&E staining showed an increase in
haematopoiesis in the spleen after repeated bleeding (b; bled) as evidenced
by the presence of megakaryocytes (arrows; n = 3 mice per condition from
3 independent experiments). c–i, Bleeding significantly increased spleen
cellularity (c), as well as the numbers of HSCs (d), MEPs (e, f), and the
numbers of Ter119+ erythroid cells (g) and Gr-1+Mac-1+ myeloid cells (h)
in the spleen but not the number of B220+ or CD3+ lymphoid cells (i).
j, k, After EMH induced by bleeding, Scf-GFP was expressed by VE-cadherin+
endothelial cells and VE-cadherin− stromal cells (j) while Cxcl12-DsRed
was expressed by a subset of the VE-cadherin− Scf-GFP+ stromal cells
(j, k). l, Whole-mount imaging of a thick spleen section from a Scf^GFP;
Cxcl12^DsRed mouse after bleeding (representative images from 3 mice in
3 independent experiments). m, n, The numbers of Scf-GFP+ cells (m)
and Cxcl12-DsRed+ cells (n) significantly increased upon bleeding.
o, Endothelial and stromal cells in the spleen proliferated after bleeding.
BrdU was administered to Scf^GFP mice or Cxcl12^DsRed mice for 15 days
beginning after the first bleeding. The numbers of mice per treatment
are shown in each bar in each panel. Each panel shows mean ± s.d. from
three independent experiments. Two-tailed Student’s t-tests were used to
assess statistical significance (**P < 0.01, ***P < 0.001). p–r, Bleeding did
not alter the general distribution (p), perivascular localization (q)
or surface marker expression (r; PDGFR-β+ and LepR−) of Tomato+
cells in the spleens of Tcf21^Cre/ER; R26^tdTomato mice. s, t, Tomato expression
remained undetectable in bone marrow from Tcf21^Cre/ER; R26^tdTomato mice
after bleeding. u, v, After bleeding, Tcf21-Cre/ER recombined only in
rare perivascular cells in the liver (p–v; n = 3 mice per genotype from
3 independent experiments).


[Figure panels omitted. Recoverable labels: Vav1-cre; R26^tdTomato after bleeding: spleen, femur, femoral BM and liver (VE-cadherin); bar charts for normal and bled mice of bone marrow (BM) and spleen (SP) cellularity, HSCs, GMPs, CMPs, MEPs and lineages (erythroid, myeloid, B, T), plus WBC (K/µl), PLT (M/µl) and RBC (M/µl) counts.]

Extended Data Figure 10 | Blood loss does not significantly change
the recombination pattern of Vav1-Cre in the spleen, bone marrow,
or liver; conditional deletion of Cxcl12 from Tcf21+ spleen stromal cells
in bled mice impairs EMH in the spleen without significantly affecting
bone marrow haematopoiesis. a–e′, Four-to-six-month-old mice with the
indicated genetic backgrounds were repeatedly bled over a 2-week period.
a–h, After EMH induced by blood loss, Vav1-Cre recombined efficiently
in VE-cadherin+ endothelial cells in the red pulp of the spleen (a–c)
but poorly in the bone marrow (d–f) and liver (g, h), similar to what we
observed under normal conditions (see Fig. 4a–c and Extended Data Fig. 5b)
(a–h; n = 3 mice from 3 independent experiments). i–n, Conditional
deletion of Scf using Tcf21-Cre/ER and/or Vav1-Cre did not significantly
affect the numbers of GMPs (i), CMPs (j), MEPs (k), Ter119+ (erythroid),
Gr-1+Mac-1+ (myeloid), CD3+ (T) and B220+ (B) cells (l) in the bone
marrow or WBC (m) or platelet counts in the blood (n). o, p, Conditional
deletion of Scf using Tcf21-Cre/ER and/or Vav1-Cre significantly reduced
GMPs (o), Ter119+ erythrocytes and Gr-1+Mac-1+ myeloid cells (p) in the
spleen. i–p, Data represent mean ± s.d. from 3 independent experiments.
q–v, Conditional deletion of Cxcl12 from Tcf21+ spleen cells did not
significantly affect bone marrow cellularity (q), or the numbers of HSCs
(r), GMPs (s), CMPs (t), MEPs (u), Ter119+ (erythroid), Gr-1+Mac-1+
(myeloid), CD3+ (T) and B220+ (B) cells (v) in one femur and one tibia
from bled mice. w–b′, Conditional deletion of Cxcl12 from Tcf21+ spleen
cells significantly reduced spleen cellularity (w), and the numbers of
MEPs (a′) and erythroid cells (b′) in the spleens of bled mice. Conditional
deletion of Cxcl12 from Tcf21+ spleen cells did not significantly affect the
numbers of HSCs (x), GMPs (y), or CMPs (z) in the spleens of bled mice.
c′–e′, Conditional deletion of Cxcl12 from Tcf21+ spleen cells significantly
reduced RBC (c′) but not WBC (d′) or platelet counts (e′) in the blood of
mice that had been repeatedly bled. q–e′, Data represent mean ± s.d. from
3 independent experiments. The numbers of mice per treatment are shown
in each bar in each panel. Statistical significance of differences among
genotypes was assessed using a repeated measures one-way ANOVA with
Greenhouse-Geisser correction along with Tukey’s multiple comparison
tests with individual variances. *P < 0.05, **P < 0.01, ***P < 0.001,
statistical significance relative to normal mice. †P < 0.05, ††P < 0.01,
†††P < 0.001, statistical significance between Scf mutant mice and control
mice after bleeding.




Extended Data Table 1 | Genes that are significantly more highly expressed by Scf-GFP+ stromal cells in spleen as compared to bone marrow


Gene | Description | Spleen Scf-GFP+ | BM Scf-GFP+ | Fold
Coch | Coagulation factor C homolog
Ccl21a | Chemokine (C-C motif) ligand 21A
Acta2 | Actin, alpha 2, smooth muscle, aorta
Cxcl13 | Chemokine (C-X-C motif) ligand 13
Tcf21 | Transcription factor 21
Clca1 | Chloride channel calcium activated 1
Ifi27l2a | Interferon, alpha-inducible protein 27 like 2A | 11.3 ± 0.2
Pln | Phospholamban
Parm1 | Prostate androgen-regulated mucin-like 1
Fn1 | Fibronectin 1
Col14a1 | Collagen, type XIV, alpha 1 | 10.4 ± 0.2
Nr4a1 | Nuclear receptor subfamily 4, group A, member 1
Agtr1a | Angiotensin II receptor, type 1a
Fos | FBJ osteosarcoma oncogene
Atp1b2 | ATPase, Na+/K+ transporter, beta 2 | 10.6 ± 0.2
Tnxb | Tenascin XB
Myh11 | Myosin, heavy polypeptide 11, smooth muscle
Hspb1 | Heat shock protein 1
Clca2 | Chloride channel calcium activated 2
Tagln | Transgelin (Mm.283283) | 10.4 ± 0.5 | 7.3 ± 0.9 | 8.6


Significance was considered as >8 fold and P < 0.015. Data show mean ± s.d. for log2-transformed expression values (n = 3 independent samples per cell population). Maximal background expression
was considered to be 6.6 (log2(100)); all expression values below this threshold were set to 6.6 for purposes of calculating fold change. Two-tailed Student’s t-tests were used to assess statistical
significance. Data for bone marrow Scf-GFP+ stromal cells are from ref. 19.
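
The fold-change rule in the table note can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis code: the function names and the example values are hypothetical, and only the background floor (6.6, roughly log2(100)), the >8-fold cut-off and the P < 0.015 cut-off are taken from the note.

```python
import math

# Thresholds taken from the table note; everything else here is illustrative.
BACKGROUND_LOG2 = 6.6   # stated background floor, approximately log2(100)
MIN_FOLD = 8.0          # a gene must be more than 8-fold higher in spleen
MAX_P = 0.015           # and have P below 0.015


def floored(log2_value, floor=BACKGROUND_LOG2):
    """Raise any log2 expression value below the background floor to the floor."""
    return max(log2_value, floor)


def fold_change(spleen_log2, bm_log2, floor=BACKGROUND_LOG2):
    """Linear fold change (spleen over bone marrow) from floored log2 values."""
    return 2.0 ** (floored(spleen_log2, floor) - floored(bm_log2, floor))


def is_enriched(spleen_log2, bm_log2, p_value):
    """Apply both cut-offs: more than 8-fold and P < 0.015."""
    return fold_change(spleen_log2, bm_log2) > MIN_FOLD and p_value < MAX_P


if __name__ == "__main__":
    # Hypothetical values: a bone marrow signal of 3.0 is floored to 6.6 first.
    print(round(fold_change(10.6, 3.0), 1))
    print(is_enriched(11.3, 6.0, 0.001))
```

Because values below background are raised to 6.6 before subtraction, a gene that is undetectable in bone marrow cannot produce an arbitrarily large fold change; an 8-fold difference corresponds to a gap of 3 log2 units.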




ARTICLE 


doi:10.1038/nature15748 


Epithelial-to-mesenchymal transition 
is not required for lung metastasis but 
contributes to chemoresistance 


Kari R. Fischer, Anna Durrans, Sharrell Lee, Jianting Sheng, Fuhai Li, Stephen T. C. Wong, Hyejin Choi,
Tina El Rayes, Seongho Ryu, Juliane Troeger, Robert F. Schwabe, Linda T. Vahdat, Nasser K. Altorki,
Vivek Mittal & Dingcheng Gao


The role of epithelial-to-mesenchymal transition (EMT) in metastasis is a longstanding source of debate, largely owing
to an inability to monitor transient and reversible EMT phenotypes in vivo. Here we establish an EMT lineage-tracing
system to monitor this process in mice, using a mesenchymal-specific Cre-mediated fluorescent marker switch system
in spontaneous breast-to-lung metastasis models. We show that within a predominantly epithelial primary tumour, a
small proportion of tumour cells undergo EMT. Notably, lung metastases mainly consist of non-EMT tumour cells that
maintain their epithelial phenotype. Inhibiting EMT by overexpressing the microRNA miR-200 does not affect lung
metastasis development. However, EMT cells significantly contribute to recurrent lung metastasis formation after
chemotherapy. These cells survived cyclophosphamide treatment owing to reduced proliferation, apoptotic tolerance
and increased expression of chemoresistance-related genes. Overexpression of miR-200 abrogated this resistance.
This study suggests the potential of an EMT-targeting strategy, in conjunction with conventional chemotherapies, for
breast cancer treatment.


Despite significant advances in diagnosing and treating cancer, metastasis
persists as a barrier to successful therapy and the main cause of
cancer-related death1. The EMT, wherein epithelial cells depolarize,
lose their cell-cell contacts, and gain an elongated, fibroblast-like
morphology, is a potential mechanism by which tumour cells gain
metastatic features. Functional implications of EMT include enhanced
mobility, invasion and resistance to apoptotic stimuli2,3. Moreover,
through EMT tumour cells acquire cancer stem cell, secondary
tumour-initiating and chemoresistance properties4–6. However, the
importance of EMT in vivo is fiercely debated owing to major challenges.
Mesenchymal tumour cells cannot easily be distinguished from
neighbouring stromal cells, and metastatic lesions mostly exhibit
epithelial phenotypes7. The latter may be due to the hypothesized reverse
process, mesenchymal to epithelial transition (MET), of the disseminated
tumour cells. Studies have confirmed that mesenchymal cells
are more capable of escaping the primary tumour, and of reaching
distant sites, but it remains unproven that those same cells complete
the full metastatic cascade in the form of a secondary nodule. Without
evidence for the dissemination, colonization and metastatic outgrowth
of mesenchymal tumour cells, the role of EMT will remain contested.
In this study, we employed multiple transgenic mouse models,
establishing a cell lineage tracing approach together with characterization
of epithelial and mesenchymal markers, to address the requirement
of EMT in metastasis. The newly established transgenic model also
provided us a unique opportunity to study the contribution of EMT
to chemoresistance.


EMT lineage tracing during metastasis 

To track EMT during metastasis in vivo, we generated a mesenchymal-specific,
Cre-mediated fluorescent marker switch strategy
and established a triple-transgenic mouse model
(MMTV-PyMT/Rosa26-RFP-GFP/Fsp1-cre, tri-PyMT, Fig. 1a). In these mice,
spontaneous multifocal breast adenocarcinomas with distinct epithelial
characteristics resembling the human luminal subtype develop
in the mammary glands, and give rise to lung metastases with high
penetrance8,9. The Fsp1 (fibroblast specific protein 1) promoter drives
expression of Cre recombinase in cells of mesenchymal lineage10.
A Cre-switchable fluorescent marker (lox-RFP-STOP-lox-GFP) is
ubiquitously expressed under the control of the β-actin promoter
in the Rosa26 locus11. Fsp1 is the critical gatekeeping gene of EMT
initiation12, and its early activation in this process13 allows for lineage
tracing of tumour cells that have undergone EMT in vivo.
Importantly, the colour switch system is irreversible: even if the
mesenchymal tumour cells undergo MET in the metastatic organs14,
they would remain GFP+.
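
The logic of the irreversible colour switch can be illustrated with a toy state model in Python. This is a deliberately simplified sketch under stated assumptions, not a description of the authors' methods: the class and method names are hypothetical, Cre recombination is treated as instantaneous and complete on EMT, and the transient RFP+/GFP+ double-positive state is ignored.

```python
from dataclasses import dataclass


@dataclass
class TumourCell:
    """Toy model of one reporter cell (hypothetical names throughout)."""
    recombined: bool = False   # has Cre excised the floxed RFP-STOP cassette?
    mesenchymal: bool = False  # current phenotype of the cell

    @property
    def colour(self) -> str:
        # The reporter colour depends only on the genomic recombination
        # state, not on the current phenotype.
        return "GFP" if self.recombined else "RFP"

    def undergo_emt(self) -> None:
        # EMT activates the mesenchymal-specific promoter, which drives Cre;
        # the excision is a permanent genomic change.
        self.mesenchymal = True
        self.recombined = True

    def undergo_met(self) -> None:
        # MET reverts the phenotype but cannot restore the excised cassette.
        self.mesenchymal = False


if __name__ == "__main__":
    cell = TumourCell()
    print(cell.colour)   # RFP: epithelial cell that never activated Cre
    cell.undergo_emt()
    cell.undergo_met()
    print(cell.colour)   # GFP: epithelial again, but permanently marked
```

In this model a GFP+ epithelial cell in a metastasis would be read as a cell that passed through EMT at some earlier point, whereas an RFP+ cell never activated the reporter.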

Primary breast tumours developed in the tri-PyMT mice at 8 weeks
of age. Immunofluorescence revealed that the majority of tumour cells,
identified by PyMT oncogene expression, were RFP positive (Fig. 1b).
These cells expressed E-cadherin and lacked vimentin (Extended Data
Fig. 1a), indicating their epithelial phenotype. The GFP+ cells detected
in the tumour bed were largely haematopoietic cells as they are PyMT
negative and express CD45, a pan-haematopoietic marker (Extended
Data Fig. 1a), which is consistent with previous reports15. Altogether,


1Department of Cardiothoracic Surgery, Weill Cornell Medical College of Cornell University, 1300 York Avenue, New York, New York 10065, USA. 2Department of Cell and Developmental Biology,
Weill Cornell Medical College of Cornell University, 1300 York Avenue, New York, New York 10065, USA. 3Neuberger Berman Lung Cancer Center, Weill Cornell Medical College of Cornell University,
1300 York Avenue, New York, New York 10065, USA. 4Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medical College of Cornell University, 1300 York Avenue, New York,
New York 10065, USA. 5Department of Systems Medicine and Bioengineering, Houston Methodist Research Institute, Houston Methodist Hospital, Houston, Texas 77030, USA. 6Methodist
Cancer Center, Houston Methodist Hospital, Houston, Texas 77030, USA. 7Soonchunhyang Institute of Medi-bio Science (SIMS), Soonchunhyang University, 25 Bongjeong-ro Cheonan-Si,
Chungcheongnam-do 31151, South Korea. 8Department of Medicine, Columbia University, College of Physicians and Surgeons, New York, New York 10032, USA. 9Institute of Human Nutrition,
Columbia University, New York, New York 10032, USA. 10Department of Medicine, Weill Cornell Medical College of Cornell University, 1300 York Avenue, New York, New York 10065, USA.


472 | NATURE | VOL 527 | 26 NOVEMBER 2015 


[Figure 1 panels omitted. Recoverable labels: MMTV; PyMT or Neu; RFP+; GFP+; EMT; Mesenchymal.]

Figure 1 | Establishing an EMT lineage tracing system in triple-transgenic
mice. a, Schematic of triple-transgenic mice carrying polyoma middle-T
(PyMT) or Neu oncogenes driven by the MMTV promoter, Cre recombinase
under the control of the Fsp1 promoter, and floxed RFP-STOP followed
by GFP under control of the β-actin promoter in the Rosa26 locus. RFP+
epithelial tumour cells undergoing EMT permanently convert into GFP+
cells following activation of Fsp1-Cre. b, c, Immunofluorescent microscopy
images of tri-PyMT primary tumours (b) and lung metastases (met; c) (>10
sections from 3 mice), depicting RFP+ and GFP+ cells within the tumour
bed, and staining (white, pseudo-coloured) for PyMT. Scale bars, 100 µm.


these data suggest that tumour cells maintain their original RFP expression
and epithelial phenotype in the primary tumour.

Lung metastases developed spontaneously in tri-PyMT mice at 12
weeks of age. Surprisingly, the PyMT-positive metastatic lesions were
RFP+ (Fig. 1c), and epithelial (E-cadherin+/vimentin−) (Extended Data
Fig. 1b), whereas only non-tumour cells expressed GFP. These results
indicate that tumour cells did not activate the mesenchymal-specific
Fsp1 promoter, and retained their epithelial phenotype during metastasis.
Thus, tumour cells may not undergo EMT to form metastatic lesions.


Lineage tracing in additional models 

To exclude the possibility that the absence of EMT in metastasis may be
unique to PyMT-driven breast tumours, we established EMT lineage
tracing in the Neu oncogene-driven16 spontaneous breast cancer model
(MMTV-Neu/Rosa26-RFP-GFP/Fsp1-Cre, tri-Neu mouse). The Neu
(ErbB-2) proto-oncogene is associated with 20–30% of human breast
cancers, and MMTV-Neu transgenic mice spontaneously develop
focal adenocarcinomas resembling human luminal phenotypes after
an extended latency at 6–8 months of age. Lung metastases are frequently
(72%) observed in these transgenic mice at 9–12 months of age.
Mirroring the tri-PyMT model, the Neu+ tumour cells in both primary
and metastatic lesions in tri-Neu mice were also RFP+ and epithelial
(E-cad+/Vim−) (Extended Data Fig. 2). Therefore, the absence of EMT
during metastasis formation is an oncogene-independent phenomenon,
manifesting in both PyMT and Neu-driven tumours.

To overcome the limitation of using solely Fsp1-Cre to indicate EMT,
we acquired the vimentin-CreER transgenic mouse, which successfully
traced mesenchymal lineage cells during liver fibrosis17, and generated
an additional EMT lineage tracing model (tri-PyMT/Vim mice,
MMTV-PyMT/Rosa26-RFP-GFP/Vimentin-creER). After continuous
induction of Cre activity by tamoxifen injection (2 mg, intraperitoneal,
three times per week starting when the primary tumours appear
at 8 weeks of age) the majority of tumour cells in both the primary
and metastatic lesions in tri-PyMT/Vim mice were RFP+ (Extended
Data Fig. 3), suggesting an absence of vimentin promoter activation


[Figure 2 panels omitted. Recoverable labels: sorted RFP+ tri-PyMT cells; primary tumour P1; P10 + 10% FBS; GFP; gene expression (relative to Gapdh).]

Figure 2 | The EMT lineage tracing system reports EMT in tumour cells
with high fidelity. a, Scatter plots from flow cytometry analysis of tri-PyMT
primary tumour cells, depicting GFP+ and RFP+ populations in the primary
tumour immediately after sorting of RFP+ cells (P1), and after ten passages
in culture with 10% FBS (P10 + 10% FBS). Numbers indicate the percentage
of RFP+ and GFP+ cells in the total population. b, Phase contrast/fluorescent
overlay image of tri-PyMT cells in culture. Scale bar, 50 µm. c, Western blot of
sorted RFP+ and GFP+ tri-PyMT cells for E-cadherin, vimentin and β-actin as
a loading control. Representative of two individual experiments. For original
gel images, see Supplementary Fig. 1. d, Representative imaging of GFP+ and
RFP+ tumour cells in primary tumours (PT) and lung metastases (LM) in the
orthotopic model (n = 8 mice). Arrow indicates scattered GFP+ EMT tumour
cells in the primary tumour. Scale bars, 100 µm (PT) and 50 µm (LM).
e, qRT-PCR analysis of relative expression of EMT markers in RFP+ and GFP+
cells sorted from orthotopic tri-PyMT primary tumours. Gapdh served as the
internal control. E-cadherin is encoded by the Cdh1 gene. Occludin is encoded
by the Ocln gene. Data are reported as mean ± s.e.m., n = 4 primary tumours.


during lung metastasis formation. EMT marker staining also revealed
the epithelial phenotype (E-cad+/Vim−) of the tumour cells in both
primary and metastatic lesions (Extended Data Fig. 3).

Together, results from two oncogene-driven metastatic tumour models
(MMTV-PyMT and MMTV-Neu) and two independent mesenchymal-specific
reporters (Vim-Cre and Fsp1-Cre) suggest that EMT
does not significantly contribute to the development of lung metastases.


Validating EMT lineage tracing 

To evaluate the specificity and sensitivity of the EMT lineage tracing
system, we established a cell line from the tri-PyMT breast tumours. In
culture, RFP+ tri-PyMT cells switched their fluorescent marker expression
to GFP, as indicated by the presence of an RFP+/GFP+ double-positive
transitioning population (Fig. 2a). The cells were cultured in
10% FBS, and serum is known to be enriched for many EMT-promoting
factors including TGF-βs18. Moreover, addition of TGF-β1 in low-serum
conditions (2% FBS) yielded an increase in GFP+ cells
(Extended Data Fig. 4a). In concert with the fluorescent marker switch,
tri-PyMT cells changed their morphology from cobblestone-like clusters
of epithelial cells to dispersed spindle-shaped mesenchymal cells
(Fig. 2b). Reflecting the morphologic differences, the GFP+ cells were
more motile than RFP+ cells (Extended Data Fig. 4b).

The fidelity of the EMT lineage tracing system was confirmed by
analysis of EMT marker expression in sorted RFP+ and GFP+ tri-PyMT
cells. RFP+ cells expressed elevated levels of epithelial markers
including E-cadherin and Occludin, while GFP+ cells expressed


26 NOVEMBER 2015 | VOL 527 | NATURE | 473 


© 2015 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


several mesenchymal markers including vimentin, FSP1, Twist, Zeb1 
and Zeb2 as determined by quantitative reverse transcription PCR 
(qRT-PCR) (Extended Data Fig. 4c). Both RFP* and GFP* tri-PyMT 
cells expressed the PyMT oncogene. Consistently, western blot analysis 
confirmed the differential expression of E-cadherin and vimentin in 
RFP* and GFP* cells (Fig. 2c). Flow cytometry for E-cadherin revealed 
that the majority of E-cadherin” cells were GFP* (97.4%) (Extended 
Data Fig. 4d). Of note, the E-cadherin* cells were either RFP* (93.6%) 
or RFP*/GFP* (6.0%), demonstrating that tumour cells switch their 
fluorescent marker expression before the loss of epithelial markers, and 
validating the early reporting of EMT in our system. These results con- 
firm that the Fsp1-Cre-mediated fluorescent marker switch in tumour 
cells reports EMT with high fidelity and efficiency. 


Rare EMT events in tumour progression 

In the triple-transgenic models, ubiquitous expression of GFP in the
tumour microenvironment precluded detection of potentially rare GFP⁺
tumour cells. To confine the fluorescence to tumour cells, we established
an orthotopic model by implanting purified RFP⁺ tri-PyMT cells in
wild-type mice (Extended Data Fig. 5a, b). Consistent with observations in
the triple-transgenic mice, primary tumours contained RFP⁺ epithelial
cells (Fig. 2d). However, GFP⁺ cells were also detected, indicating tumour
EMT (Fig. 2d and Extended Data Fig. 5c, upper panel). These cells lacked
E-cadherin (Extended Data Fig. 5c, upper panel) and made up 1.98 ± 1.40%
(n = 6) of the total tumour cells (Extended Data Fig. 5d). qRT-PCR
analysis of EMT markers comparing sorted RFP⁺ and GFP⁺ cells from the
same primary tumour confirmed the mesenchymal phenotype of the GFP⁺ cells
(Fig. 2e). Importantly, these GFP⁺ EMT tumour cells did not contribute to
lung metastasis. Early disseminated tumour cells detected in the lungs
were epithelial and RFP⁺ (Extended Data Fig. 5c, middle panel), and 28
lung nodules detected in 8 mice maintained the epithelial phenotype
(Fig. 2d and Extended Data Fig. 5c, lower panel).

We also established an orthotopic tri-PyMT/Vim model, wherein tamoxifen
was administered directly after orthotopic injection to ensure immediate
tracing of EMT events. Consistently, the majority of tumour cells in both
primary and metastatic tumours were RFP⁺ and epithelial (Extended Data
Fig. 6). Again, GFP⁺ EMT events (4.46 ± 1.0% of total tumour cells, n = 3)
were detected in the primary tumours.

To further dissect the metastatic cascade, we quantified the relative
numbers of RFP⁺ and GFP⁺ cells in the primary tumour, blood and
metastases of the tri-PyMT orthotopic model by flow cytometry. An RFP to
GFP ratio of ~100:1 in the primary tumour and ~15:1 in the blood was
observed (Extended Data Fig. 7a, b). However, the enrichment of GFP⁺ cells
in the circulation did not translate into an advantage in metastatic
outgrowth, as the RFP:GFP ratio in the lung was ~150:1. Altogether, these
findings are consistent with our observations in the triple-transgenic
models, suggesting that the majority of breast tumour cells persist in an
epithelial state during primary tumour growth and lung metastasis
formation.
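
The compartment ratios quoted above can be converted into GFP⁺ fractions and
fold changes to make the comparison explicit. The following Python sketch is
illustrative only: it uses the rounded ratios given in the text (~100:1, ~15:1
and ~150:1), and the function names are hypothetical helpers, not part of the
analysis performed in this study.

```python
# Approximate RFP:GFP ratios quoted in the text (flow cytometry,
# tri-PyMT orthotopic model). The values are rounded, so results are rough.
RFP_PER_GFP = {"primary tumour": 100.0, "blood": 15.0, "lung": 150.0}


def gfp_fraction(rfp_per_gfp: float) -> float:
    """Fraction of fluorescent tumour cells that are GFP+ for an N:1 RFP:GFP ratio."""
    if rfp_per_gfp < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (rfp_per_gfp + 1.0)


def gfp_enrichment(ratio_before: float, ratio_after: float) -> float:
    """Fold change in GFP:RFP odds between two compartments (>1 means GFP+ enriched)."""
    if ratio_before <= 0 or ratio_after <= 0:
        raise ValueError("ratios must be positive")
    return ratio_before / ratio_after


if __name__ == "__main__":
    for site, ratio in RFP_PER_GFP.items():
        print(f"{site}: GFP+ fraction of about {gfp_fraction(ratio):.2%}")
    to_blood = gfp_enrichment(RFP_PER_GFP["primary tumour"], RFP_PER_GFP["blood"])
    to_lung = gfp_enrichment(RFP_PER_GFP["blood"], RFP_PER_GFP["lung"])
    print(f"primary tumour -> blood: {to_blood:.1f}-fold change in GFP:RFP odds")
    print(f"blood -> lung: {to_lung:.1f}-fold change in GFP:RFP odds")
```

On these rounded figures, GFP⁺ cells are enriched roughly 6.7-fold in blood
relative to the primary tumour, but their share falls about tenfold between
blood and lung.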


EMT inhibition and metastasis formation 

In spite of the extensive characterization of the EMT reporter system,
there remained a remote possibility that our reporter failed to capture
all EMT events in vivo. Therefore, we sought to inhibit EMT and determine
its impact on metastasis. We ectopically expressed miR-200, a well-known
inhibitor of EMT that directly targets Zeb1 and Zeb2, the transcriptional
repressors of E-cadherin¹⁹,²⁰. We posited that stably expressing miR-200
in tri-PyMT cells would block EMT and trap tumour cells in a permanent
epithelial state. Compared with control cells, miR-200-overexpressing
cells (Extended Data Fig. 7c) showed elevated expression of epithelial
cell markers and reduced expression of mesenchymal markers (Extended Data
Fig. 7d). As expected, overexpression of miR-200 inhibited the RFP to GFP
conversion (>90% remaining RFP⁺, Fig. 3a). These results substantiate
effective miR-200 suppression of EMT in the tri-PyMT cells.




[Figure 3 graphic: a, flow cytometry plots (RFP versus GFP) for control and
miR-200 cells; b, lung metastases, control and miR-200; c, bar chart,
control versus miR-200.]


Figure 3 | miR-200 inhibition of EMT in tri-PyMT cells did not impact
lung metastasis. a, Flow cytometry analysis of tri-PyMT control and
miR-200-expressing cells, indicating the percentage of RFP⁺ and GFP⁺ cells.
b, Representative histologic lung images in tri-PyMT control and
miR-200-expressing orthotopic mice (n = 5). Scale bar, 1.5 mm.
c, Quantification of lung metastasis formation (number of individual
nodules) in tri-PyMT control and miR-200-expressing tumour-bearing mice
(n = 5). Data reported as the mean ± s.e.m.


To explore the impact of inhibiting EMT on metastasis formation in vivo,
we orthotopically injected miR-200-overexpressing tri-PyMT cells. We
identified 18 metastases in 5 mice, a similar number per mouse to that
observed in mice bearing control tri-PyMT cells (28 metastases in 8 mice)
(Fig. 3b, c). These results demonstrate that inhibition of EMT by miR-200
overexpression does not impair the ability of tumour cells to form distant
lung metastases.
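
To make the comparison between groups of different size concrete, the nodule
counts can be expressed per mouse. The short Python sketch below does only
this arithmetic on the totals reported in the text; the function name is a
hypothetical helper, and the sketch does not replace the per-animal
quantification shown in Fig. 3c.

```python
def nodules_per_mouse(total_nodules: int, n_mice: int) -> float:
    """Mean number of lung nodules per mouse from group totals."""
    if n_mice <= 0:
        raise ValueError("n_mice must be positive")
    return total_nodules / n_mice


# Group totals reported in the text.
control_rate = nodules_per_mouse(28, 8)  # control tri-PyMT cells
mir200_rate = nodules_per_mouse(18, 5)   # miR-200-overexpressing cells

if __name__ == "__main__":
    print(f"control: {control_rate:.2f} nodules per mouse")
    print(f"miR-200: {mir200_rate:.2f} nodules per mouse")
```

This gives 3.5 nodules per mouse for control and 3.6 for miR-200, which is the
sense in which the two groups are similar.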


EMT is involved in chemoresistance 

Emerging evidence suggests a molecular and phenotypic association between
EMT and chemoresistance in several cancers²¹,²². Compellingly, residual
breast cancers following chemotherapy display a mesenchymal phenotype and
tumour-initiating features²³. To determine if the acquisition of
chemoresistance induces specific molecular changes consistent with EMT, we
evaluated the orthotopic tri-PyMT model under chemotherapy. Animals with
established primary tumours were treated with cyclophosphamide (CTX), a
commonly used drug in breast cancer treatment²⁴ (100 mg kg⁻¹, once per
week, for two weeks before and two weeks after surgery; Fig. 4a). The
tumours responded to chemotherapy, manifesting a 60% reduction in growth
and markedly enhanced apoptotic activity (Extended Data Fig. 8a-c). Of
note, the RFP⁺ cells were highly proliferative and apoptotic in comparison
with GFP⁺ cells in CTX-treated mice (Extended Data Fig. 8d-g), suggesting
that GFP⁺ cells have reduced susceptibility to chemotherapy. However, in
the primary tumour, the GFP⁺ cell percentage remained static under CTX
treatment (Extended Data Fig. 8h).

Remarkably, in the early metastatic lungs (four weeks after tumour
inoculation), flow cytometry analysis revealed a 2.7:1 ratio of GFP⁺ to
RFP⁺ cells in CTX-treated mice (Fig. 4b). Subsequently, at four weeks
after cessation of treatment, a notable contribution of GFP⁺ tumour
cells was detected in 5 out of 17 metastatic lesions (Fig. 4a). This is in
contrast to untreated mice, where all metastatic lesions were derived
from RFP⁺ cells (Fig. 2d), suggesting that the EMT process may be
involved in metastatic outgrowth in the context of chemotherapy.

[Figure 4 graphic: a, treatment timeline (orthotopic injection of RFP⁺
tri-PyMT cells, PT removal) with lung metastasis images (GFP/RFP/DAPI);
b, control versus CTX; c, RFP and GFP cells against CTX (μM);
d, pre-injection flow cytometry plot, tail vein injection and CTX
treatment; e, f, control versus CTX.]

Figure 4 | EMT tumour cells are resistant to chemotherapy. a, Schema
of CTX treatment in tri-PyMT orthotopic model. Mice bearing an RFP⁺
primary tumour were treated with CTX (100 mg kg⁻¹, once per week, for
4 weeks, as indicated by blue arrows). After 2 weeks of treatment, primary
tumour (PT) was removed (black arrow). Lung metastasis growth was
permitted for 4 weeks post CTX treatment. Fluorescent imaging of lungs
revealed the contribution of GFP⁺ tumour cells to lung metastases (n = 9
mice). b, Ratio of GFP⁺ to RFP⁺ cells in early metastatic lungs (4 weeks
post orthotopic injection) of untreated control and CTX-treated mice
as quantified by flow cytometry (n = 4, *P < 0.05). Data reported as the
mean ± s.e.m. c, Apoptosis (as measured by Annexin binding) of RFP⁺
and GFP⁺ tri-PyMT cells treated with CTX (n = 2 biological replicates).
d, Flow cytometry scatter plot showing the proportions of RFP⁺ and
GFP⁺ tri-PyMT cells before intravenous injection. Mice were treated
with CTX (100 mg kg⁻¹ per week for 3 weeks, n = 5 mice per group).
e, Quantification of flow cytometry data showing the percentage of RFP⁺
and GFP⁺ tumour cells (red and green bars, respectively) of total cells in
the lung of control and CTX-treated mice (n = 5 mice per group, *P < 0.05).
f, Quantification of flow cytometry data showing the ratio of GFP⁺ to
RFP⁺ cells in lungs of control and CTX-treated mice. Black line represents
the starting ratio of GFP⁺ to RFP⁺ cells before injection as derived from
the data in Fig. 4d (*P < 0.05). Data reported as the mean ± s.e.m.

To evaluate the effects of CTX on the EMT and non-EMT cell populations,
sorted GFP⁺ and RFP⁺ cells were incubated with CTX in vitro; the GFP⁺
cells were markedly more resistant to both short- and long-term treatment
(Fig. 4c and Extended Data Fig. 9a, b). The selective advantage of
mesenchymal tumour cells in the context of chemotherapy was then
corroborated by a competitive survival assay in vivo (Fig. 4d). Mice were
injected intravenously with an equivalent number of RFP⁺ and GFP⁺ cells,
and immediately received CTX (100 mg kg⁻¹, once per week). After three
weeks, lungs were harvested and the ratio of RFP⁺ and GFP⁺ cells was
assessed by flow cytometry. CTX significantly inhibited outgrowth of lung
metastasis from both RFP⁺ and GFP⁺ cells (Fig. 4e). The untreated lungs
were morbidly overwhelmed with tumours, with nearly 80% of the tumour
cells detected as RFP⁺. Conversely, in CTX-treated mice, more than 60% of
the surviving tumour cells were GFP⁺, producing a significantly higher
ratio of GFP:RFP cells in these mice (Fig. 4f). These results indicate
that GFP⁺ EMT cells are more resistant to chemotherapy both in vitro and
in vivo.
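
The percentages from the competitive survival assay can be restated as
GFP:RFP ratios, the quantity plotted in Fig. 4f. The Python sketch below is a
rough illustration under two assumptions that are not from the measured data:
the round values 20% and 60% stand in for "nearly 80% RFP⁺" and "more than 60%
GFP⁺", and every tumour cell is taken to be scored as either GFP⁺ or RFP⁺. The
function name is hypothetical.

```python
def gfp_to_rfp_ratio(gfp_share: float) -> float:
    """Convert the GFP+ share of tumour cells (0 to 1) into a GFP:RFP ratio.

    Assumes every tumour cell is scored as either GFP+ or RFP+.
    """
    if not 0.0 <= gfp_share < 1.0:
        raise ValueError("gfp_share must be in [0, 1)")
    return gfp_share / (1.0 - gfp_share)


# ASSUMPTION: round numbers standing in for the text's "nearly 80%" RFP+
# (untreated) and "more than 60%" GFP+ (CTX-treated, so a lower bound).
untreated_ratio = gfp_to_rfp_ratio(0.20)
treated_ratio_lower_bound = gfp_to_rfp_ratio(0.60)

if __name__ == "__main__":
    print(f"untreated GFP:RFP of about {untreated_ratio:.2f}")
    print(f"CTX-treated GFP:RFP of at least about {treated_ratio_lower_bound:.2f}")
```

Under these assumptions the GFP:RFP ratio rises from about 0.25 in untreated
lungs to at least about 1.5 after CTX, roughly a sixfold shift towards GFP⁺
cells.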

Immunostaining revealed that in the untreated mice, both RFP⁺ and GFP⁺
cells formed epithelial metastatic lesions (E-cad⁺/Vim⁻) (Extended Data
Fig. 9c). Given the initial mesenchymal phenotypes of GFP⁺ cells before
injection, this suggests that the GFP⁺ tumour cells have undergone MET in
the metastatic organ. On the other hand, in CTX-treated mice the majority
of surviving tumour cells were scattered mesenchymal GFP⁺ cells
(E-cad⁻/Vim⁺) (Extended Data Fig. 9d). Together, these observations
suggest that EMT tumour cells that sustain a mesenchymal phenotype are
resistant to chemotherapy.

[Figure 5 graphic: a, control and miR-200 curves against CTX (μM); b, lung
metastases, control and miR-200; c, bar chart, control versus miR-200.]

Figure 5 | miR-200 overexpression abrogates CTX resistance.
a, Sensitivity of control and miR-200-expressing tri-PyMT tumour cells
to CTX treatment as measured by CellTiter-Glo (n = 4 biological replicates
per condition). b, Representative histologic lung images in tri-PyMT
control and miR-200-expressing tumour-bearing mice treated with CTX
(n = 5). Scale bar, 1.5 mm. c, Quantification of lung metastasis formation
(number of individual nodules) in CTX-treated tri-PyMT control and
miR-200-expressing tumour-bearing mice (n = 5). Data reported as the
mean ± s.e.m.

To begin to investigate the molecular underpinnings of mesenchymal tumour
cell resistance, we analysed the transcriptomic changes of EMT tumour
cells. We sorted RFP⁺ and GFP⁺ cells and performed RNA-sequencing
analysis (Supplementary Information Table 1). In addition to the expected
changes in EMT marker expression (Extended Data Fig. 10a), the expression
of many cell-proliferation-related genes was reduced in GFP⁺ cells
(Extended Data Fig. 10b), mirroring their phenotype of reduced
proliferation in vivo. The GFP⁺ cells also showed increased expression of
proven chemoresistance-related factors including IL6, Periostin, Enpp2 and
Pdgfr²⁵⁻²⁸. Additionally, the CTX-treated GFP⁺ cells elevated their
expression of many drug-metabolizing enzymes including drug transporters
(Abcb1a, Abcb1b and Abcc1), aldehyde dehydrogenases (ALDHs), cytochrome
P450s, and glutathione-metabolism-related enzymes (Extended Data
Fig. 10c). The main toxicity of CTX is due to its metabolite
phosphoramide mustard, which is only formed in cells with low levels of
ALDHs. ALDH converts the CTX metabolite aldophosphamide into the
non-toxic carboxyphosphamide²⁹. In accordance with the transcriptomic
data, GFP⁺ cells had significantly higher ALDH activity compared with
RFP⁺ cells (Extended Data Fig. 10d). These properties of reduced
proliferation, increased apoptotic resistance, and upregulation of
chemoresistance and drug-metabolizing genes in GFP⁺ EMT tumour cells may
contribute to their insensitivity to CTX. Notably, GFP⁺ cells were also
refractory to other commonly used chemotherapies including doxorubicin,
paclitaxel, and fluorouracil treatment (Extended Data Fig. 10e).

To demonstrate that EMT is required for the generation of CTX resistance,
we first tested in vitro the effect of treatment on control and
miR-200-overexpressing tri-PyMT cells. With increasing concentrations of
CTX, the miR-200 cells were significantly more susceptible to therapy
(Fig. 5a). We then expanded upon this finding in vivo, establishing
orthotopic control and miR-200 primary tumours, and applying the pre- and
post-surgery CTX regimen. We found that by blocking EMT in tumour cells,
we effectively ablated metastatic growth (Fig. 5b, c). Thus, EMT
contributes to the development of chemoresistant metastasis.


Discussion 

Using two independent EMT lineage tracing strategies in two dis- 
parate oncogene-driven autochthonous models of breast cancer, we 
demonstrated that lung metastases are derived from non-EMT tumour
cells, contradicting the original EMT/MET hypothesis**”. In a tracing 
system similar to our own, EMT was identified in primary tumours, 
but the mesenchymal lineage status of the metastatic nodules was 
not pursued³¹. Ultimately, in our models we found that tumour cells
disseminate and form metastases while persisting in their epithelial
phenotype, in accordance with a recent study³². To underline that
EMT is not required for metastasis, overexpression of miR-200 (a
microRNA that is incongruously associated with both reduced
invasion¹⁹,²⁰ and increased metastasis³³) resulted in combined
suppression of the EMT-promoting transcription factors Snail1/2, Twist,
Zeb1 and Zeb2, but had no effect on metastasis. Given that both
epithelial and mesenchymal tumour cells have the potential to disseminate,
it is plausible that the larger fraction of highly proliferative epithelial
cells outcompetes the minor EMT tumour cell population in generating
macrometastatic lesions.

Until now, most data connecting EMT with chemoresistance were derived
from in vitro studies or clinical prognostic data. Here we demonstrate
that highly proliferative non-EMT cells are sensitive to chemotherapy, and
observe the emergence of recurrent EMT-derived metastases after treatment.
There is a great emphasis on developing EMT-targeting therapies³⁴,³⁵, and
our studies suggest that while EMT blockade may not affect metastasis
formation, specifically targeting EMT tumour cells will be synergistic
with conventional chemotherapy. Thus, our EMT lineage tracing system
provides a unique preclinical platform to develop combination therapies
that will eliminate both populations, and combat chemoresistance.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 17 August 2014; accepted 23 September 2015. 
Published online 11 November 2015. 


1. Bastid, J. EMT in carcinoma progression and dissemination: facts, unanswered 
questions, and clinical considerations. Cancer Metastasis Rev. 31, 277-283 
(2012). 

2. Kalluri, R. & Weinberg, R. A. The basics of epithelial-mesenchymal transition. 
J. Clin. Invest. 119, 1420-1428 (2009). 

3. Scheel, C. & Weinberg, R. A. Phenotypic plasticity and epithelial-mesenchymal
transitions in cancer and normal stem cells? Int. J. Cancer 129, 2310-2314
(2011).

4. Mani, S. A. et al. The epithelial-mesenchymal transition generates cells with
properties of stem cells. Cell 133, 704-715 (2008).

5. Gal, A. et al. Sustained TGF beta exposure suppresses Smad and non-Smad 
signalling in mammary epithelial cells, leading to EMT and inhibition of growth 
arrest and apoptosis. Oncogene 27, 1218-1230 (2008). 

6. Hennessy, B. T. et al. Characterization of a naturally occurring breast cancer 
subset enriched in epithelial-to-mesenchymal transition and stem cell 
characteristics. Cancer Res. 69, 4116-4124 (2009). 

7. Gao, D. et al. Myeloid progenitor cells in the premetastatic lung promote 
metastases by inducing mesenchymal to epithelial transition. Cancer Res. 72, 
1384-1394 (2012). 

8. Lin, E. Y. et al. Progression to malignancy in the polyoma middle T oncoprotein
mouse breast cancer model provides a reliable model for human diseases.
Am. J. Pathol. 163, 2113-2126 (2003).

9. Guy, C. T., Cardiff, R. D. & Muller, W. J. Induction of mammary tumors by
expression of polyomavirus middle T oncogene: a transgenic mouse model for
metastatic disease. Mol. Cell. Biol. 12, 954-961 (1992).

10. Bhowmick, N. A. et al. TGF-β signaling in fibroblasts modulates the oncogenic
potential of adjacent epithelia. Science 303, 848-851 (2004).

11. Muzumdar, M. D., Tasic, B., Miyamichi, K., Li, L. & Luo, L. A global double-
fluorescent Cre reporter mouse. Genesis 45, 593-605 (2007).

12. Xue, C., Plieth, D., Venkov, C., Xu, C. & Neilson, E. G. The gatekeeper effect of
epithelial-mesenchymal transition regulates the frequency of breast cancer
metastasis. Cancer Res. 63, 3386-3394 (2003).

13. Okada, H., Danoff, T. M., Kalluri, R. & Neilson, E. G. Early role of Fsp1 in
epithelial-mesenchymal transformation. Am. J. Physiol. 273, F563-F574
(1997).

14. Gunasinghe, N. P., Wells, A., Thompson, E. W. & Hugo, H. J. Mesenchymal-
epithelial transition (MET) as a mechanism for metastatic colonisation in
breast cancer. Cancer Metastasis Rev. 31, 469-478 (2012).

15. Cabezon, T. et al. Expression of S100A4 by a variety of cell types present in the
tumor microenvironment of human breast cancer. Int. J. Cancer 121,
1433-1444 (2007).



16. Guy, C. T. et al. Expression of the neu protooncogene in the mammary
epithelium of transgenic mice induces metastatic disease. Proc. Natl Acad.
Sci. USA 89, 10578-10582 (1992).

17. Troeger, J. S. et al. Deactivation of hepatic stellate cells during liver fibrosis
resolution in mice. Gastroenterology 143, 1073-1083 (2012).

18. Dumont, N. et al. Sustained induction of epithelial to mesenchymal transition
activates DNA methylation of genes silenced in basal-like breast cancers. Proc.
Natl Acad. Sci. USA 105, 14867-14872 (2008).

19. Park, S. M., Gaur, A. B., Lengyel, E. & Peter, M. E. The miR-200 family 
determines the epithelial phenotype of cancer cells by targeting the E-cadherin 
repressors ZEB1 and ZEB2. Genes Dev. 22, 894-907 (2008). 

20. Gregory, P. A. et al. The miR-200 family and miR-205 regulate epithelial to 
mesenchymal transition by targeting ZEB1 and SIP1. Nature Cell Biol. 10, 
593-601 (2008). 

21. Singh, A. & Settleman, J. EMT, cancer stem cells and drug resistance: an 
emerging axis of evil in the war on cancer. Oncogene 29, 4741-4751 (2010). 

22. Zhang, Y., Toy, K. A. & Kleer, C. G. Metaplastic breast carcinomas are enriched 
in markers of tumor-initiating cells and epithelial to mesenchymal transition. 
Mod. Pathol. 25, 178-184 (2012). 

23. Creighton, C. J. et al. Residual breast cancers after conventional therapy
display mesenchymal as well as tumor-initiating features. Proc. Natl Acad.
Sci. USA 106, 13820-13825 (2009).

24. von Minckwitz, G. Docetaxel/anthracycline combinations for breast cancer 
treatment. Expert Opin. Pharmacother. 8, 485-495 (2007). 

25. Lau, C. K. et al. An Akt/hypoxia-inducible factor-1α/platelet-derived growth
factor-BB autocrine loop mediates hypoxia-induced chemoresistance in liver
cancer cells and tumorigenic hepatic progenitor cells. Clin. Cancer Res. 15,
3462-3471 (2009).

26. Xiao, Z. M., Wang, X. Y. & Wang, A. M. Periostin induces chemoresistance in 
colon cancer cells through activation of the PI3K/Akt/survivin pathway. 
Biotechnol. Appl. Biochem. 62, 401-406 (2015). 

27. Yamada, D. et al. Role of crosstalk between interleukin-6 and transforming 
growth factor-beta 1 in epithelial-mesenchymal transition and 
chemoresistance in biliary tract cancer. Eur. J. Cancer 49, 1725-1740 (2013). 

28. Yao, Z. et al. TGF-β IL-6 axis mediates selective and adaptive mechanisms of
resistance to molecular targeted therapy in lung cancer. Proc. Natl Acad.
Sci. USA 107, 15535-15540 (2010).

29. Russo, J. E. & Hilton, J. Characterization of cytosolic aldehyde dehydrogenase
from cyclophosphamide resistant L1210 cells. Cancer Res. 48, 2963-2968
(1988).

30. Tam, W. L. & Weinberg, R. A. The epigenetics of epithelial-mesenchymal
plasticity in cancer. Nature Med. 19, 1438-1449 (2013).

31. Trimboli, A. J. et al. Direct evidence for epithelial-mesenchymal transitions in 
breast cancer. Cancer Res. 68, 937-945 (2008). 

32. Yu, M. et al. Circulating breast tumor cells exhibit dynamic changes in epithelial 
and mesenchymal composition. Science 339, 580-584 (2013). 

33. Korpal, M. et al. Direct targeting of Sec23a by miR-200s influences cancer cell
secretome and promotes metastatic colonization. Nature Med. 17, 1101-1108
(2011).

34. Gupta, P. B. et al. Identification of selective inhibitors of cancer stem cells by
high-throughput screening. Cell 138, 645-659 (2009).

35. Diessner, J. et al. Targeting of preexisting and induced breast cancer stem cells 
with trastuzumab and trastuzumab emtansine (T-DM1). Cell Death Dis. 5, 
e1149 (2014). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by a grant from the US 
Department of Defense CDMRP LCRP (LC110643). K.F. is supported by a 
fellowship from the NIH (1 F31 CA186510-01). V.M. was supported by the 
National Cancer Institute sub-award (U54 CA149196-05) and WCMC Meyer 
Cancer Center Pilot Funding. This work was also supported by funds from 

The Neuberger Berman Foundation Lung Cancer Research Center; the Arthur 
and Myra Mahon Donor-Advised Fund; the Liz Claiborne and Art Ortenberg 
Foundation; the Douglas & Katherine McCormick Family Foundation, the R. & M. 
Goldberg Family Foundation; the P. & C. Collins Fund; the Eliot Stewart ‘Wren’ 
Fund; the William and Shelby Modell Family Foundation Trust; and generous 
funds donated by patients in the Division of Thoracic Surgery to N.K.A. The 
funding organizations played no role in experimental design, data analysis or 
manuscript preparation. 


Author Contributions D.G., K.R.F. and V.M. designed the experiments. K.R.F. 

and D.G. performed the experiments. A.D., S.L., H.C., T-E.R. and S.R. provided 
technical support with experiments and animal work. L.T.V. and N.K.A. made 
critical comments to improve the study design. J.S., F.L. and S.T.C.W. performed 
RNA-sequencing analysis. J.T. and R.F.S. generated the Vim-Cre transgenic mice. 
K.R.F., D.G. and V.M. wrote and edited the manuscript with input from the other
authors. All authors discussed the results and conclusions drawn from them. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to V.M. 
(vim2010@med.cornell.edu) or D.G. (dig2009@med.cornell.edu). 


© 2015 Macmillan Publishers Limited. All rights reserved 


METHODS 

Animals. Wild-type C57BL/6 and FVB/n mice, and transgenic
ACTB-tdTomato-EGFP (stock no. 007676), Fsp1-Cre (stock no. 012641),
MMTV-PyMT (stock no. 002374), and MMTV-Neu (stock no. 002376) mice were
obtained from The Jackson Laboratory. The vimentin-CreER mouse was a kind
gift from the laboratory of R. F. Schwabe at Columbia University. CB-17
SCID mice were obtained from Charles River Laboratories. All mouse strains
obtained were bred in the animal facility at Weill Cornell Medical
College. All animal work was conducted in accordance with a protocol
approved by the Institutional Animal Care and Use Committee at Weill
Cornell Medical College.

The ACTB-tdTomato-EGFP and Fsp1-Cre mice were bred together to obtain
double transgenic mice and then bred with MMTV-PyMT or MMTV-Neu mice to
obtain the tri-PyMT and tri-Neu triple-transgenic mice, respectively.
Double transgenic male mice carrying ACTB-tdTomato-EGFP and MMTV-PyMT
were crossed with the vimentin-CreER mice to obtain the tri-PyMT/Vim
triple-transgenic mice. Genotyping for each transgenic line was performed
following the standardized protocols described on the website of The
Jackson Laboratory. Genotyping for vimentin-CreER was done using forward
primer 5′-CCCCTTCCTCACTTCTTTCC and reverse primer
5′-ATGTTTAGCTGGCCCAAATG.
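
As a quick sanity check on the vimentin-CreER genotyping primers listed
above, their length, GC content and an approximate melting temperature can be
computed directly from the sequences. The Python sketch below is illustrative
only: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough rule of thumb
intended for short oligonucleotides, it is an assumption here and not the
method used in this study, and the function names are hypothetical. No
annealing temperature for this genotyping PCR is given in the text.

```python
FORWARD = "CCCCTTCCTCACTTCTTTCC"  # vimentin-CreER forward primer (from the text)
REVERSE = "ATGTTTAGCTGGCCCAAATG"  # vimentin-CreER reverse primer (from the text)


def _validated(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = _validated(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = _validated(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(primer)} nt, GC {gc_fraction(primer):.0%}, "
              f"Wallace Tm about {wallace_tm(primer)} C")
```

Both primers are 20 nucleotides long; the estimate is meant only to show that
the pair is reasonably matched, not to prescribe cycling conditions.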

Tamoxifen injection. To induce vimentin-CreER activity in the tri-PyMT/Vim
mice, tamoxifen (Sigma-Aldrich, 2 mg per mouse, dissolved in corn oil) was
administered through intraperitoneal injections, three times per week
starting when the primary tumours appeared (at 8 weeks of age) and
continuing for 6 weeks until metastasis developed in the lung.

Establishing tri-PyMT cell line. The primary tumour of the tri-PyMT mouse
(12-week-old female) was surgically removed under sterile conditions. Tumour
tissue was sliced into ~1 mm³ blocks and implanted into the fat pad (no. 4 on the
right side) of CB-17 SCID mice. The secondary tumour was used to establish the
tri-PyMT cell line, eliminating the contamination of fluorescent-positive stromal
cells in the tumour tissue from tri-PyMT transgenic mice.

Tumour tissue was minced and digested with an enzyme cocktail (collagenase
A, elastase, and DNase I, Roche Applied Science) in HBSS buffer at 37 °C for
30 min. The cell suspension was strained through a 40-μm cell strainer (BD
Biosciences). Cells were washed with PBS three times and loaded into the Aria
III cell sorter (BD Biosciences). The sorted RFP⁺ cells were cultured in DMEM
supplemented with 10% fetal bovine serum. The PyMT oncogene expression in
the established cell line was confirmed by RT-PCR (Extended Data Fig. 4c). The
tumorigenic ability of these cells was confirmed throughout the study.

To determine EMT induction by TGF-β, cells were cultured for one week in
DMEM with 2% FBS and 2 ng ml⁻¹ TGF-β1 (R&D Systems). The GFP⁺ cell ratio
was quantified by flow cytometry.

To generate the miR-200 overexpressing cell line, a pLenti 4.1 Ex miR-200b- 
200a-429 construct”, was obtained from Addgene. To eliminate the contamination 
of fluorescent marker expression in targeted cells, the GFP gene in this construct
was removed by BstBI/XbaI digestion followed by blunted self-ligation. Lentivirus
was packaged by co-transfection of the pLenti-miR-200 construct and packaging
plasmids into HEK293T cells. tri-PyMT cells (passage 2) were infected with
the lentivirus. Infected cells (tri-PyMT miR-200) were selected by culturing with
puromycin (2 μg ml⁻¹) for 14 days. A control tri-PyMT cell line was generated by
infecting cells with lentivirus carrying the puromycin resistance gene, following
the same procedure in parallel.

Orthotopic breast tumour model. To establish an orthotopic breast tumour
model, we first purified RFP⁺ cells from passages 10-15 of tri-PyMT cell culture by
FACS. The purified RFP⁺ tri-PyMT cells (1 x 10° cells with purity >99%, Extended
Data Fig. 5a) were injected into the mammary fat pad of 8-week-old female CB-17
SCID mice. The growth of the primary tumour was monitored by external calliper
measurement once a week. After approximately 4 weeks, the primary tumour was
surgically removed and the incision was closed with wound clips. The tumour
size did not exceed 5% of total body weight as permitted in the IACUC protocol.
Animals were euthanized 4 weeks after primary tumour removal to analyse the
development of pulmonary metastasis. For animals subjected to chemotherapy,
cyclophosphamide (CTX, Sigma-Aldrich, 100 mg kg⁻¹) was administered once
per week, for 2 weeks before and 2 weeks after surgery.
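
Because CTX is dosed by body mass, the amount given per injection depends on
the weight of each animal. The Python sketch below works through the
arithmetic for the regimen described above (100 mg kg⁻¹, once per week, two
doses before and two after surgery). The 20 g body mass is a purely
illustrative assumption, since animal weights are not stated in the text, and
the function name is hypothetical.

```python
def dose_mg(dose_mg_per_kg: float, body_mass_g: float) -> float:
    """Amount of drug (mg) for one injection, given a mg/kg dose and mass in grams."""
    if dose_mg_per_kg < 0 or body_mass_g <= 0:
        raise ValueError("dose must be non-negative and body mass positive")
    return dose_mg_per_kg * body_mass_g / 1000.0


CTX_MG_PER_KG = 100.0   # from the text
N_DOSES = 4             # once per week: 2 weeks before plus 2 weeks after surgery
ASSUMED_MASS_G = 20.0   # ASSUMPTION: illustrative body mass, not stated in the text

single_dose_mg = dose_mg(CTX_MG_PER_KG, ASSUMED_MASS_G)
cumulative_dose_mg = single_dose_mg * N_DOSES

if __name__ == "__main__":
    print(f"single dose: {single_dose_mg:.1f} mg")
    print(f"cumulative over {N_DOSES} doses: {cumulative_dose_mg:.1f} mg")
```

For the assumed 20 g animal this corresponds to 2 mg per injection and 8 mg
over the full regimen; in practice the dose would be recalculated from each
animal's measured weight.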

Tissue processing, immunofluorescence and microscopy. The harvested primary
tumours and PBS-perfused lungs bearing metastases were fixed in 4%
paraformaldehyde overnight, followed by 30% sucrose for 2 days, and then
embedded in Tissue-Tek O.C.T. embedding compound (Electron Microscopy
Sciences). Serial sections (10 μm, at least 10 sections) were prepared for
histological analysis by haematoxylin and eosin staining, and
immunofluorescent staining following standardized protocols.

Primary antibodies used in this study include CD45 (30-F11, BioLegend),
E-cadherin (DECMA-1, BioLegend), vimentin (sc-7557, Santa Cruz), PyMT
(ab15085, Abcam), Neu (sc-284, Santa Cruz), Ki67 (ab15580, Abcam), and active
caspase-3 (C92-605, BD Pharmingen). Primary antibodies were directly conjugated
to Alexa Fluor 647 using an antibody labelling kit (Invitrogen) according to the
manufacturer's instructions and purified over BioSpin P30 columns (Bio-Rad).
GFP⁺ and RFP⁺ cells were detected by inherent fluorescence.

Fluorescent images were obtained using a computerized Zeiss fluorescent 
microscope (Axiovert 200M), fitted with an apotome and an HRM camera. Images 
were analysed using Axiovision 4.6 software (Carl Zeiss).

Flow cytometry and cell sorting. For the metastatic lungs and primary tumours,
cell suspensions were prepared by digesting tissues with an enzyme cocktail
(collagenase A, elastase, and DNase I, Roche Applied Science) in HBSS buffer at
37 °C for 30 min. For cultured cells, cells were collected through trypsinization. A
single-cell suspension was prepared by filtering through a 30-μm cell strainer (BD
Biosciences). Then cells were stained following a standard immunostaining protocol.
In brief, cells were pre-blocked with 2% FBS plus Fc block (CD16/CD32, 1:30, BD
Biosciences) and then incubated with the primary antibody against E-cadherin
(DECMA-1, BioLegend). SYTOX Blue (Invitrogen) was added to the staining tube
in the last 5 min to facilitate the elimination of dead cells. GFP⁺ and RFP⁺ cells
were detected by their intrinsic signals. The stained samples were analysed using
the LSRII flow cytometer coupled with FACS Diva software (BD Biosciences). Flow
cytometry analysis was performed using a variety of controls including isotype
antibodies, unstained and single-colour stained samples for determining appropriate
gates, voltages and compensations required in multivariate flow cytometry.

For sorting live cells back for further culturing or injection into animals, we 
used the Aria II cell sorter coupled with FACS Diva software (BD Biosciences). The 
preparation of cells for sorting was performed under sterile conditions. The purity 
of subpopulations after sorting was confirmed by analysing post-sort samples in 
the sorter again. 

Quantitative RT-PCR analysis. Total RNA was extracted using the RNeasy Kit 
(Qiagen), and miRNA using the mirVana miRNA isolation kit (Life Technologies). 
RNA was converted to cDNA using qScript cDNA SuperMix (Quanta Biosciences) 
and RT-PCR. qPCR was performed with the appropriate primers (sequences 
shown in the table) and iQ™ SYBR Green master mix (Bio-Rad). The PCR protocol 
comprised initial denaturing at 95 °C for 3 min; 40 cycles of 95 °C for 20 s, 
60 °C for 30 s and 72 °C for 30 s; and a final extension at 72 °C for 5 min, 
followed by melt curve analysis. Reactions were run on a Bio-Rad CFX96 Real Time 
System (Bio-Rad) coupled with Bio-Rad CFX Manager software. Primers used are as 
follows: 
GAPDH, forward, 5'-GGTCCTCAGTGTAGCCCAAG-3'; reverse, 5'-AATGTGTCCGTCGTGGATCT-3'; 
Cdh1 (E-cadherin), forward, 5'-ACACCGATGGTGAGGGTACACAGG-3'; reverse, 5'-GCCGCCACACACAGCATAGTCTC-3'; 
Ocln, forward, 5'-TGCTAAGGCAGTTTTGGCTAAGTCT-3'; reverse, 5'-AAAAACAGTGGTGGGGAACGTG-3'; 
Vim, forward, 5'-TGACCTCTCTGAGGCTGCCAACC-3'; reverse, 5'-TTCCATCTCACGCATCTGGCGCTC-3'; 
Cdh2 (N-cadherin), forward, 5'-AAAGAGCGCCAAGCCAAGCAGC-3'; reverse, 5'-TGCGGATCGGACTGGGTACTGTG-3'; 
FSP-1, forward, 5'-CCTGTCCTGCATTGCCATGAT-3'; reverse, 5'-CCCACTGGCAAACTACACCC-3'; 
Snai1, forward, 5'-ACTGGTGAGAAGCCATTCTCCT-3'; reverse, 5'-CTGGCACTGGTATCTCTTCACA-3'; 
Snai2, forward, 5'-TTGCAGACAGATCAAACCTGAG-3'; reverse, 5'-TGTTTATGCAGAAGCGACATTC-3'; 
Twist1, forward, 5'-AGCTACGCCTTCTCCGTCTG-3'; reverse, 5'-CTCCTTCTCTGGAAACAATGACA-3'; 
Zeb-1, forward, 5'-GATTCCCCAAGTGGCATATACA-3'; reverse, 5'-TGGAGACTCCTTCTGAGCTAGTG-3'; 
Zeb-2, forward, 5'-TGGATCAGATGAGCTTCCTACC-3'; reverse, 5'-AGCAAGTCTCCCTGAAATCCTT-3'; 
PyMT, forward, 5'-ACTGCTACTGCACCCAGACA-3'; reverse, 5'-CTGGAAGCCGGTTCCTCCTA-3'; 
GFP, forward, 5'-CCACATGAAGCAGCACGACT-3'; reverse, 5'-GGGTCTTGTAGTTGCCGTCG-3'; 
RFP, forward, 5'-AGCGCGTGATGAACTTCGAG-3'; reverse, 5'-CCGCGCATCTTCACCTTGTA-3'. 
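
The quantification model behind the relative expression values is not stated in this section. As an illustration only, the sketch below assumes the widely used 2^-ddCt calculation with a reference gene (Gapdh is named as the internal control in the figure legends). The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs,
    which is an assumption of this sketch and not a statement from the text.
    """
    # Normalize the target to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the sample with the calibrator (for example, a control population).
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values: a target gene in one sorted population relative to
# another, each normalized to the reference gene measured in the same sample.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,
    ct_target_calibrator=22.0, ct_reference_calibrator=18.0,
)
print(fold_change)  # 0.25
```

A two-cycle delay relative to the calibrator corresponds to a fourfold lower expression under this model.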

RNA-sequencing analysis. Total RNA was extracted from sorted RFP+ and GFP+ 
tri-PyMT cells with the RNeasy Kit (Qiagen). RNA-seq libraries were constructed 
and sequenced following standard protocols (Illumina). Single-end RNA-seq reads 
were mapped to the UCSC mouse genome (GRCm38/mm10) using TopHat2. FPKM 
values for each gene were estimated by Cufflinks, and statistical analysis was done 
using Cuffdiff2. Heat maps for differentially expressed genes with adjusted P values 
<0.05 were drawn using the gplots R package. 
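
The heat-map step can be pictured as a filter followed by row scaling. The sketch below is a minimal Python illustration, not the authors' R code: it keeps genes whose adjusted P value is below 0.05 and converts each gene's expression values to row z-scores (the heat-map keys in Extended Data Fig. 10 are labelled Row Z-Score). Gene names, values and function names are hypothetical, and any log transformation applied before plotting is not specified in the text.

```python
from statistics import mean, stdev


def row_z_scores(values):
    """Scale one gene's expression values to mean 0 and sample s.d. 1."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        # A gene with identical values in all samples carries no contrast.
        return [0.0 for _ in values]
    return [(v - centre) / spread for v in values]


def heatmap_matrix(expression_by_gene, adjusted_p, alpha=0.05):
    """Keep genes with adjusted P < alpha and return their row z-scores."""
    return {
        gene: row_z_scores(values)
        for gene, values in expression_by_gene.items()
        if adjusted_p[gene] < alpha
    }


# Hypothetical FPKM values for two samples per group.
expression = {"GeneA": [1.0, 2.0, 3.0], "GeneB": [4.0, 4.0, 5.0]}
p_values = {"GeneA": 0.01, "GeneB": 0.20}
print(heatmap_matrix(expression, p_values))  # {'GeneA': [-1.0, 0.0, 1.0]}
```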

Western blot analysis. Cells were homogenized in 1× RIPA lysis buffer (Millipore) 
with protease inhibitors (Roche Applied Science). Samples were boiled in 1× 
Laemmli buffer and 10% β-mercaptoethanol, and loaded onto 12% gradient 
Tris-glycine gels (Bio-Rad). Western blotting was performed using antibodies specific 
for E-cadherin (clone DECMA-1), vimentin (clone RV202, BD Pharmingen), and 
β-actin (clone AC-15, Sigma-Aldrich). 

Cell apoptosis and viability assays. To determine apoptosis of RFP+ and 
GFP+ cells, tri-PyMT cells (Passage 10) were seeded on adherent six-well plates 
(1 x 10° cells), and treated with 4-hydroperoxy cyclophosphamide (Santa Cruz) 
for 48 h. After treatment, cells were trypsinized and stained with APC-conjugated 
Annexin V (BD Biosciences) and SYTOX Blue (Invitrogen) for apoptotic-cell 
labelling. The stained cells were analysed in the LSRII flow cytometer to quantify 
the percentage of apoptotic, dead, and live RFP+ and GFP+ cells by FACS Diva 
software. To determine the viability of tri-PyMT control and miR-200-expressing 
cells treated with CTX, cells were plated in 96-well adherent black-walled plates 
(1 x 10‘ cells), and treated with 4-hydroperoxy cyclophosphamide for 48h. After 
treatment, cell viability was measured with the CellTiter-Glo Luminescent Cell 
Viability Assay (Promega). 

Cell migration assay. 1 x 10° tri-PyMT cells were seeded in a six-well plate. 
Real-time images of cells (including phase, GFP and RFP channels) were taken 
under a computerized Zeiss microscope (Axiovert observation) every 10 min for 
10 h. Movement of individual cells (>10 RFP+ and >10 GFP+ cells in each field, 
>2 fields analysed) was tracked with ImageJ software, and the distance 
travelled during that time was measured as indicated. 
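
As a rough illustration of the distance measurement, the sketch below sums the straight-line steps between consecutive tracked positions of one cell. It is a minimal example under stated assumptions: positions are (x, y) pairs in micrometres exported from the tracking software, one per frame, and the coordinates shown are hypothetical.

```python
import math


def path_length(track):
    """Total distance travelled along a list of (x, y) positions."""
    # Pair each position with the next one and add up the step lengths.
    return sum(math.dist(a, b) for a, b in zip(track, track[1:]))


def mean_path_length(tracks):
    """Average distance travelled over several tracked cells."""
    return sum(path_length(t) for t in tracks) / len(tracks)


# Hypothetical track of one cell, positions in micrometres.
cell_track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(path_length(cell_track))  # 11.0
```

With one image every 10 min for 10 h, each track would hold 61 positions if the first and last frames are both included.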


ALDH activity assay. RFP* and GFP* tri-PyMT cells (1 x 10° cells each) were 
freshly sorted from culture by FACS and then homogenized in cold ALDH Assay 
buffer provided in the ALDH Activity Colorimetric Assay Kit (Biovision Inc.). 
Following the protocol, ALDH substrate and acetaldehyde were added. ALDH 
activities in samples were measured by OD at 450 nm in kinetic mode (every 3 min 
for 60 min). 
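
One simple way to summarize a kinetic read-out like this is the slope of OD450 against time. The sketch below fits that slope by ordinary least squares. It is an illustrative proxy only: the kit's own activity calculation (for example, conversion through a standard curve) is not described in this section, and the readings shown are hypothetical.

```python
def kinetic_slope(times_min, od_values):
    """Least-squares slope of OD against time (OD units per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_od = sum(od_values) / n
    # Slope = covariance(time, OD) / variance(time).
    s_tt = sum((t - mean_t) ** 2 for t in times_min)
    s_ty = sum((t - mean_t) * (y - mean_od)
               for t, y in zip(times_min, od_values))
    return s_ty / s_tt


# Hypothetical readings taken every 3 min.
times = [0, 3, 6, 9]
od450 = [0.10, 0.13, 0.16, 0.19]
print(round(kinetic_slope(times, od450), 4))  # 0.01
```

A full run sampled every 3 min for 60 min would supply 21 time points, for example `list(range(0, 61, 3))`.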

Statistical analysis. To determine the sample size of animal experiments, we used 
power analysis assuming $\frac{\text{difference in means}}{\text{standard deviation}} > 2.5$. 
Therefore, all animal experiments 
were conducted with >5 mice per group to ensure adequate power between groups 
by two-sample t-test comparison. Animals were randomized within each experimental 
group. No blinding was applied in performing experiments. Results are 
expressed as mean ± s.e.m. Data distribution in groups and significance between 
different treatment groups were analysed using the Mann-Whitney U-test in 
GraphPad Prism software. P values <0.05 were considered significant. Error bars 
depict s.e.m., except where indicated otherwise. 
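
To make the sample-size reasoning concrete, the sketch below applies the standard normal-approximation formula for a two-sided, two-sample comparison, n per group = 2((z_{1-α/2} + z_{power}) / d)², with the standardized effect size d = 2.5 from the text. The significance level and target power are assumptions of this example, since the text does not state them, and the approximation tends to underestimate n for very small groups compared with a t-based calculation.

```python
from math import ceil
from statistics import NormalDist


def mice_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate group size for a two-sided, two-sample comparison.

    effect_size is (difference in means) / (standard deviation).
    alpha and power are assumed values, not taken from the text.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


print(mice_per_group(2.5))              # 3
print(mice_per_group(2.5, power=0.95))  # 5
```

Under these assumed settings the approximation gives group sizes at or below five, in line with the group size used in the animal experiments.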


Image panels (labels as extracted): Tri-PyMT Primary Tumor; CD45; E-cad. 

Extended Data Figure 1 | Characterization of the primary tumour and 
lung metastasis of tri-PyMT mice. a, b, Sections of primary tumours 
(a) and lungs (b) from tri-PyMT mice were immunostained for E-cadherin 
(E-cad, top), vimentin (Vim, middle) and CD45 (bottom) in white 
pseudo-colour. Representative images are shown (n > 5 mice). Note the 
co-localization of PyMT with RFP, and CD45 with GFP (as indicated by 
arrows), in both primary tumours and lung metastases. 


Image panels (labels as extracted): Tri-Neu mouse (MMTV-Neu/FSP1-Cre/Rosa-RGFP); 
Primary Tumor; Lung metastasis; GFP/RFP; CD45/GFP; Neu/GFP; Vim/GFP. 

Extended Data Figure 2 | Characterization of the primary tumour and 
lung metastasis of tri-Neu mice. Sections of primary tumours (left panel) 
and lungs (right panel) from tri-Neu mice were immunostained for CD45, 
Neu, E-cadherin, and vimentin (in white pseudo-colour). Representative 
images are shown (n > 5 mice). Note that both primary tumours and lung 
metastases are largely composed of epithelial RFP+ tumour cells. 


Image panels (labels as extracted): Tri-PyMT/Vim mouse 
(MMTV-PyMT/Vim-creER/Rosa-RGFP); Primary Tumor; Lung Metastases. 

Extended Data Figure 3 | Characterization of the primary tumour 
and lung metastasis of tri-PyMT/Vim mice. Tri-PyMT/Vim mice were 
obtained by crossing MMTV-PyMT, vimentin-Cre and Rosa26-RFP-GFP 
transgenic mice. a, b, Sections of primary tumours (a) and lungs (b) were 
immunostained for PyMT, E-cadherin and vimentin (in white pseudo-colour). 
Representative images are shown (n > 5 mice). Note that both 
primary tumours and lung metastases are largely composed of epithelial 
RFP+ tumour cells. 


Figure panels (labels as extracted): a, Cont and +TGF, + 2% FBS; b, RFP+ cells 
and GFP+ cells tracing plots (µm), with RFP+ versus GFP+ quantification; 
c, RFP and GFP bars grouped as Epithelial Markers, Mesenchymal Markers and 
Tumor Markers; d, Tri-PyMT cells. 

Extended Data Figure 4 | Characterization of tri-PyMT cells. a, EMT 
of tri-PyMT cells with TGF-β. RFP+ tri-PyMT cells were sorted by flow 
cytometry and cultured in medium containing 2% FBS with or without 
TGF-β1 (2 ng ml−1) for 3 days. Plot shows quantification of the percentage 
of GFP+ cells analysed by flow cytometry (n = 2 biological replicates). 
b, Cell migration assay of tri-PyMT cells. The tracing plots show the 
movement of individual RFP+ and GFP+ cells in 10 h of live imaging. 
Quantification plot (right panel) shows the average distance that RFP+ 
and GFP+ cells moved during the time frame (n > 20, *P < 0.01). 
c, Relative expression of epithelial, mesenchymal and tumour markers in 
sorted RFP+ and GFP+ tri-PyMT cells as determined by qRT-PCR with 
Gapdh as the internal control. n = 2 individual experiments. 
d, EMT of tri-PyMT cells is reported by fluorescent marker switch. Flow 
cytometry plot shows E-cadherin− (E-cad−) and E-cadherin+ (E-cad+) 
subpopulations of tri-PyMT cells (upper panel). Of the E-cad− and 
E-cad+ subsets, the populations were further dissected according to innate 
fluorescence (lower panel). Numbers indicate the percentage of GFP+, 
RFP+, or transitioning (Q2) cells in the parental E-cad− or E-cad+ subsets, 
respectively. 


Figure panels (labels as extracted): a, Tri-PyMT cells in culture, Pre-sorting 
and Post-sorting (RFP versus GFP); b, Sorted RFP+ cells, Orthotopic injection, 
4 wks, PT removal, 4 wks; d, PT. 

Extended Data Figure 5 | Establishing an orthotopic model with sorted 
RFP+ tri-PyMT cells. a, Flow cytometry plots show tri-PyMT cells before 
and after sorting for RFP+ cells. Numbers indicate the percentage and 
purity of RFP+ cells used for establishing orthotopic breast tumours in 
mice. b, Schematic of the orthotopic breast tumour model with sorted 
RFP+ tri-PyMT cells. Cells are injected into the mammary gland of 
wild-type mice to generate primary breast tumours, followed by resection of 
the primary tumour at 4 weeks and evaluation of lung metastases after 
another 4 weeks. 
c, Characterization of tumour cells in the primary tumour, disseminated 
tumour cells (DTCs) and tumour cells in the lung metastasis of the 
tri-PyMT orthotopic model. Sections of primary tumours and lungs 
from tri-PyMT orthotopic mice were immunostained for E-cadherin and 
vimentin (in white pseudo-colour). Essentially all RFP+ tumour cells 
are detected as E-cad+/Vim−, while the scattered GFP+ tumour cells 
in the primary tumour are E-cad−/Vim+ (as indicated by arrows in the 
top panel). Representative images are shown (n = 8). d, Plot shows the 
percentage of GFP+ cells out of total tumour cells (GFP+ plus 
RFP+, n = 6). 


Image panels (labels as extracted): GFP/RFP; Merge; Ecad; Metastases. 

Extended Data Figure 6 | Characterization of EMT status of 
orthotopic tri-Vim-PyMT primary tumours. a, b, Sections of 
tri-Vim-PyMT orthotopic primary tumours (a) and metastatic lung 
(b) were immunostained for E-cadherin and vimentin (in white 
pseudo-colour). As expected, RFP+ tumour cells are entirely 
E-cadherin-positive and vimentin-negative, GFP+ tumour cells are 
vimentin-positive and E-cadherin-negative, and lung metastases are 
epithelial and RFP+. 


Figure panels (labels as extracted): a, flow cytometry plots (RFP axis) for 
Mouse 1 to Mouse 4; b, chart of RFP+ : GFP+ ratios for Mouse 1 to Mouse 4 
(values as extracted, column assignment not recoverable): 19:1; 181:1, 12:1; 
84:1, 11:1; 120:1; 286:1; 296:0; 45:1; 110:1; c, miR200a, miR200b and miR429 
in Cont and +miR-200 cells; d, Gene Expression in Cont and +miR-200 cells, 
grouped as Epithelial Markers, Mesenchymal Markers and Tumor Markers. 

Extended Data Figure 7 | Dissemination of tri-PyMT cells in vivo. 
a, Disseminated tumour cells are RFP+ and epithelial. RFP+ tri-PyMT cells 
were injected into the fat pad of mice. The fluorescence of the primary 
tumour, circulating tumour cells in the blood and disseminated tumour 
cells in the lung was analysed by flow cytometry. The flow cytometry 
plots depict the enumeration of RFP+ and GFP+ cells. b, The ratios 
of detected RFP+ versus GFP+ cells are shown in the chart (n = 4 mice). 
c, Relative expression of miR-200-family microRNAs in tri-PyMT control 
and miR-200-expressing cells. n = 2 individual experiments. d, Relative 
expression of EMT markers and tumour markers in tri-PyMT control and 
miR-200-expressing cells as determined by qRT-PCR with Gapdh as the 
internal control, n = 2 individual experiments. 


Figure panels (labels as extracted): a, Primary tumor, Cont versus +CTX; 
b, Proliferation, Cont versus +CTX; c, Apoptosis, Cont versus +CTX; d, Ki67, 
GFP/RFP/DAPI/Ki67; e, Active Casp-3, Control and +CTX; f, Proliferation, 
GFP and RFP, Cont versus +CTX; g, Apoptosis, Cont versus +CTX. 

Extended Data Figure 8 | Effects of CTX therapy on primary tumours. 
a, Quantification of primary tumour growth after 2 weeks of CTX therapy. 
For tumour growth data see accompanying Source Data. b, Proliferation 
status of primary tumour cells as detected by Ki67 staining in control 
mice and after 2 weeks of CTX therapy. c, Level of apoptosis in primary 
tumours as detected by active caspase-3 staining in control mice and 
after 2 weeks of CTX therapy. d, e, Representative images of Ki67 (d) and 
active caspase-3 staining (e) (white pseudo-colour) of primary tumours 
in control mice and CTX-treated mice. Scale bars, 50 µm. f, Proliferation 
status of RFP+ and GFP+ primary tumour cells as detected by Ki67 
staining in control and CTX-treated mice. g, Level of apoptosis in RFP+ 
and GFP+ primary tumours as detected by active caspase-3 staining in 
control and CTX-treated mice. h, Percentage of GFP+ tumour cells in 
control and CTX-treated primary tumours. n = 3 mice for all figures 
described above. Quantification performed using ImageJ software. 


Figure panels (labels as extracted): Cont; +CTX; % of GFP+ cells; CTX treated 
lung; GFP/RFP/DAPI; E-cad/DAPI; Vim/DAPI; scale bar 50 µm. 

Extended Data Figure 9 | EMT tumour cells are resistant to CTX 
treatment both in vitro and in vivo. a, b, Long-term CTX treatment 
in vitro results in a GFP+ population. Tri-PyMT cells were subjected to 
2 weeks of cyclophosphamide (+CTX) treatment (4 µM). Fluorescent 
imaging (a) and flow cytometry (quantified, b, n = 3) exhibit the 
percentage of GFP+ cells in the CTX-treated culture compared to 
untreated control cells. c, d, EMT status of lung nodules in competitive 
survival assay. Representative fluorescent images of tri-PyMT lung 
metastases in untreated control lungs (c) and CTX-treated lungs (d), 
depicting RFP+ and GFP+ tumour cells. Immunostaining showing 
E-cadherin (E-cad) or vimentin (Vim) in white pseudo-colour. White 
arrow indicates GFP+ tumour cells with epithelial phenotypes 
(E-cad+/Vim−), while the yellow arrow indicates GFP+ cells with mesenchymal 
phenotypes (E-cad−/Vim+). Nuclei were counter-stained with DAPI. 
n=): 


Figure panels (labels as extracted): a, heat map of RFP+ and GFP+ samples with 
Epithelial and Mesenchymal groups (Color Key, Row Z-Score); b, heat maps with 
gene labels including Tfcp2l1, Cdk18, Ran, Mcm2, Mcm3, Mcm4, Mcm5 and Mcm6 
(left column) and Igfbp2, Cxcl12, Krt5, Pdgfra, Postn, Pdgfrb, Tle2 and B4galt5 
(right column); c, RFP+, GFP+ and +CTX GFP+ samples, with groups Drug 
transporters, Phase I drug metabolizing enzymes and Phase II drug metabolizing 
enzymes; d, ALDH activity assay, RFP and GFP, x axis Time (min); e, RFP+ and 
GFP+ bars for Cont, CTX, Dox, Taxol and 5FU. 

Extended Data Figure 10 | Gene expression profile analysis of RFP+ and 
GFP+ tri-PyMT cells. RFP+ and GFP+ tri-PyMT cells were sorted by flow 
cytometry and subjected to transcriptomic analysis by RNA-sequencing. 
a, Heat map of differentially expressed genes (adjusted P < 0.05) from 
RNA-seq of sorted RFP+ and GFP+ tri-PyMT cells, biologically duplicated. 
Genes that are established epithelial markers (Group 1) include Cdh1 
(which encodes E-cad), Dsp, Epcam, Fgfbp1, Krt18, Krt19, Ocln, Tjp3, 
Krt14 and Tjp2; the mesenchymal markers (Group 2) include Cdh2 (which 
encodes N-cad), Col23a1, Col3a1, Col5a1, Col6a2, Fsp1, Mmp3, Wnt5a and 
Zeb1. b, Cell cycle (left panel) and chemoresistance-related (right panel) 
genes alternatively regulated in RFP+ and GFP+ cells. c, GFP+ tri-PyMT 
cells were also sorted from CTX-treated (4 µM) samples. Interestingly, a 
branch of genes related to drug metabolism was significantly elevated in 
CTX-treated GFP+ cells. Group 1 genes are drug transporters including 
Abcb1a, Abcb1b and Abcc1. Group 2 genes are phase I drug-metabolizing 
enzymes including Adh7, Aldh1a1, Aldh1a3, Aldh1l1, Aldh1l2, Aldh2, 
Aldh3a1, Aldh3a2, Aldh3b2, Aldh4a1, Cyp1a1, Cyp2f2, Cyp2j6, Ptgs1 and 
Ptgs2. Group 3 genes are phase II drug-metabolizing enzymes including 
Aox1, Blvrb, Ces2e, Ces2f, Ces2g, Chst1, Ephx1, Fmo1, Gpx2, Gsta3, Gsta4, 
Gstm2, Gsto1, Gstp1, Gstt3, Maoa, Mgst1, Mgst2, Nat6, Nat9, Nqo1, Pon3, 
Ugt1a6a and Ugt1a7c. d, Aldehyde dehydrogenase (ALDH) activity assay. 
Cell lysates were prepared from flow cytometry-sorted RFP+ and GFP+ 
tri-PyMT cells. ALDH activity in samples was measured by OD at 450 nm 
in a kinetic mode (every 3 min for 60 min). Representative result from 
two independent experiments depicted. e, EMT tumour cells (GFP+ cells) 
showed resistance to multiple commonly used chemotherapies. Tri-PyMT 
cells were subjected to treatment with CTX (8 µM), doxorubicin (Dox, 
2M), paclitaxel (Taxol, 10 µM) and fluorouracil (5FU; 1.6 µM) for 3 days. 
Flow cytometry analysis of apoptotic cells was performed after Annexin 
staining. The percentage of dead cells (Annexin+) in RFP+ and GFP+ cells, 
respectively, was quantified. n = 2 biological replicates. 


ARTICLE 


doi:10.1038/nature15699 


Allosteric ligands for the pharmacologically dark receptors GPR68 and GPR65 


Xi-Ping Huang! *, Joel Karpiak**, Wesley K. Kroeze!*, Hu Zhu!+, Xin Chen*>+, Sheryl S. Moy®, Kara A. Saddoris°+, 
Viktoriya D. Nikolova®, Martilias S. Farrell'+, Sheng Wang!, Thomas J. Mangano!, Deepak A. Deshpande’, Alice Jiang!?*, 
Raymond B. Penn’, Jian Jin*+-°+, Beverly H. Koller®, Terry Kenakin!, Brian K. Shoichet? & Bryan L. Roth!?5 


At least 120 non-olfactory G-protein-coupled receptors in the human genome are ‘orphans’ for which endogenous 
ligands are unknown, and many have no selective ligands, hindering the determination of their biological functions 
and clinical relevance. Among these is GPR68, a proton receptor that lacks small molecule modulators for probing its 
biology. Using yeast-based screens against GPR68, here we identify the benzodiazepine drug lorazepam as a non-selective 
GPR68 positive allosteric modulator. More than 3,000 GPR68 homology models were refined to recognize lorazepam in a 
putative allosteric site. Docking 3.1 million molecules predicted new GPR68 modulators, many of which were confirmed 
in functional assays. One potent GPR68 modulator, ogerin, suppressed recall in fear conditioning in wild-type but 
not in GPR68-knockout mice. The same approach led to the discovery of allosteric agonists and negative allosteric 
modulators for GPR65. Combining physical and structure-based screening may be broadly useful for ligand discovery 
for understudied and orphan GPCRs. 


G-protein-coupled receptors (GPCRs)—the largest family of proteins 
encoded in the human genome—transduce signals for the most diverse 
endogenous ligands of any receptor family. Correspondingly, GPCRs 
are the most productive drug targets, with over 26% of US Food and 
Drug Administration (FDA)-approved drugs acting primarily through 
them. Astonishingly, of the 356 non-olfactory GPCRs, about 38% are 
understudied or ‘orphan’ receptors whose physiological roles, and 
often endogenous ligands, remain unknown’. Given the central role 
of GPCRs in physiology and disease, and the high conservation of 
orphan GPCRs among organisms from worms to humans, under- 
studied and orphan GPCRs are probably functionally and therapeuti- 
cally important. Indeed, for the few GPCRs deorphanized since 2003 
(refs 1, 2 and http://www.guidetopharmacology.org/GRAC/FamilyDisplayForward?familyId=16), 
most have newly approved and investigational 
drugs’. As with kinases’, epigenetic proteins” and proteases’, 
ligands specific for orphan GPCRs will illuminate their biology and 
provide new areas for therapeutic intervention. 

A key impediment to GPCR deorphanization is uncertainty about 
the proteins through which they signal, making functional assays 
problematic’. This difficulty is increased by the diverse ligands that 
GPCRs recognize, which range from protons and photons, small 
neurotransmitters and lipids, to peptides and folded proteins. Thus, 
generic functional screens are difficult for orphan GPCRs: one 
neither knows what class of compounds to screen, nor how to screen 
for it, much less how to demonstrate relevance. This explains 
the slow progress in determining their roles in signalling and 
physiology’. 


GPR68 (also known as OGR1) exemplifies both the important roles 
these understudied and orphan receptors are thought to serve, and 
our difficulties in illuminating them. Together with GPR4, GPR65 
and GPR132, GPR68 belongs to a family of proton-sensing GPCRs’. 
GPR68 couples to several signalling pathways through Gq, Gs, G12/13 
or Gi/o proteins’-'°. GPR68 is expressed in many tissues and has 
been implicated in many processes'!~'®, but it is most abundant in 
mouse cerebellum!’ and hippocampus!! (http://www.brain-map.org/), 
suggesting yet to be identified roles in brain function. In acidic 
microenvironments, GPR68 seems to regulate inflammatory pro- 
cesses in airway smooth muscle and other cells'*”°. Surprisingly, 
studies with GPR68-knockout mice uncovered only modest changes 
in these functions!®?!”, Although GPR68 has been reported to be 
activated by a family of isoxazoles'°, their weak activity seems to 
be nonspecific?** and could not be reproduced (see later). Thus, 
although GPR68 may have many roles, few of them are well- 
characterized by knockout and none is known in the central nervous 
system (CNS), where it is most highly expressed. Like other targets 
lacking small molecule reagents, GPR68 remains ‘pharmacological 
dark matter’. 

Here we describe an integrated experimental and computational 
approach to discover ligands that modulate GPR68. A lead compound 
that functions as a positive allosteric modulator (PAM) is demon- 
strated in vitro and in vivo, providing insights into GPR68 physiology. 
Application of the same approach found allosteric agonists and nega- 
tive allosteric modulators for a second understudied GPCR, GPR65, 
suggesting that the approach may be broadly useful. 


1Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7365, USA. 2National Institute of Mental Health Psychoactive Drug Screening Program 
(NIMH PDSP), School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7365, USA. 3Department of Pharmaceutical Chemistry, University of California 
at San Francisco, Byers Hall, 1700 4th Street, San Francisco, California 94158-2550, USA. 4Center for Integrative Chemical Biology and Drug Discovery (CICBDD), University of North Carolina 
at Chapel Hill, Chapel Hill, North Carolina 27599-7363, USA. 5Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, 
Chapel Hill, North Carolina 27599-7360, USA. 6Department of Psychiatry and Carolina Institute for Developmental Disabilities (CIDD), University of North Carolina at Chapel Hill, Chapel Hill, 
North Carolina 27599-7146, USA. 7Center for Translational Medicine and Department of Medicine, Thomas Jefferson University, Philadelphia, Pennsylvania 19107, USA. 8Department of Genetics, 
School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7264, USA. †Present addresses: National Center for Advancing Translational Sciences (NCATS), 
9800 Medical Center Drive, Rockville, Maryland 20850, USA (H.Z.); Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7365, USA (X.C.); 
Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, Colorado 80309, USA (K.A.S.); Department of Genetics, School of Medicine, University of North Carolina at 
Chapel Hill, Chapel Hill, North Carolina 27514, USA (M.S.F.); Touro College of Osteopathic Medicine, 60 Prospect Avenue, Middletown, New York 10940, USA (A.J.); Department of Structural and 
Chemical Biology, Department of Oncological Sciences, Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA (J.J.). 


*These authors contributed equally to this work. 


26 NOVEMBER 2015 | VOL 527 | NATURE | 477 


© 2015 Macmillan Publishers Limited. All rights reserved 


Figure 1 panels (labels as extracted): a, x axis Compound number, with 
Lorazepam and Toremifine marked; b, x axis log[drug] (M), legend Lorazepam, 
Desmethyldiazepam, Diazepam, Flunitrazepam and 4-Chlorodiazepam; c, structures 
including Diazepam, Flunitrazepam and 4-Chlorodiazepam, with activity labels 
Active, Less active and Inactive; d, y axis RLU (fold of basal), legend 
Lorazepam (µM): 0.03, 0.1, 0.3, 1, 3, 10, 30. 



Figure 1 | Lorazepam is a GPR68-positive allosteric modulator. 

a, A library of approved drugs (10 µM) screened with yeast expressing 
chimaeric G,, Gg or GPR68 and chimaeric G, (GPR68,) revealed lorazepam 
as a true and toremifine as a false positive. b, Concentration-dependent 
stimulation of GPR68 G,-yeast growth by lorazepam and analogues. 

c, Structures of representative benzodiazepines (arrows denote methyl 
substituents that reduce GPR68 activity). d, Lorazepam is a GPR68-positive 
allosteric modulator for the agonist proton in GPR68-mediated cAMP 
production. RLU, relative luminescence units. Data are mean ± s.e.m. of 
normalized results (a, b, d, n = 3) and concentration-response curves 
(b, d) were fit via a four-parameter logistic function (see Methods). 
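
For readers unfamiliar with the curve model named in the legend, the sketch below writes out one common parameterization of the four-parameter logistic function on a log-concentration axis. The exact form and fitting software used by the authors are given in their Methods and are not reproduced here, so the parameter names and the example values are assumptions of this illustration.

```python
def four_parameter_logistic(log_conc, bottom, top, log_ec50, hill_slope):
    """Response at a given log10 concentration.

    bottom and top are the lower and upper plateaus, log_ec50 is the
    log10 concentration giving the half-maximal response, and hill_slope
    sets the steepness of the transition.
    """
    return bottom + (top - bottom) / (
        1.0 + 10.0 ** ((log_ec50 - log_conc) * hill_slope)
    )


# Hypothetical curve rising from 1-fold to 5-fold of basal around 1 µM.
print(four_parameter_logistic(-6.0, 1.0, 5.0, -6.0, 1.0))  # 3.0
```

At the EC50 the response sits exactly halfway between the two plateaus, which is why the example returns 3.0.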


Yeast-based screen reveals GPR68 active compounds 
In an initial campaign with 24 selected orphan and understudied 
GPCRs, we modified a yeast assay system and screened a small 
library of approved drugs (http://www.nihclinicalcollection.com/ and 
Supplementary Fig. 1). We confirmed the known activity of short-chain 
carboxylic acids on the GPR41 and the GPR43 free fatty acid 
receptors (Extended Data Fig. 1a–d), and that of zinc (Extended Data 
Fig. 1e) and several other metals (Extended Data Fig. 1f–k) at GPR39. 
The most notable result was the finding that the benzodiazepine 
anxiolytic lorazepam was an agonist at GPR68 (Fig. 1). 

Lorazepam activated GPR68 signalling, stimulating yeast growth by 
more than twofold (Fig. 1a). N-unsubstituted benzodiazepines were 
more efficacious than N-substituted benzodiazepines (Fig. 1b, c and 
Supplementary Table 1) and activated the receptor at both pH 6.5 and 
7.4 (Extended Data Fig. 1l), with lorazepam most potently shifting 
the H+ concentration-response profile (Fig. 1d and Extended Data 
Fig. 1m–p). The pH-dependence of lorazepam activity suggested that 
it functions as a PAM of GPR68; lorazepam did not affect the activity 
of the related receptors GPR4 or GPR65 (Extended Data Fig. 2a, b). 
When profiled against a panel of CNS targets, lorazepam had substantial 
activity only at the GABAA (γ-aminobutyric acid type A) receptor, 
its therapeutic target (Extended Data Fig. 3). 


Modelling the GPR68-lorazepam complex 

Little improvement in activity or selectivity was achieved by testing 
lorazepam analogues. This observation, and the potent GABAA 
receptor activity of the drug, led us to seek specific, optimizable 
molecules from computational docking screens of multi-million molecule 
libraries (Fig. 2). 

We generated 407 homology 3D models for GPR68 templated on 
the CXCR4 structure (29% sequence identity, Extended Data Fig. 2f), 
and these were expanded by another 2,900 models using elastic network 
modelling, which sampled backbone and loop conformations. 
Against each of the 3,307 models, we computationally docked the 
active benzodiazepines, more than 440 inactive compounds from the 
National Clinical Collection (NCC; http://nihsmr.evotec.com/evotec/sets) 
library, and 176 property-matched decoy molecules”*. In each 
model, five candidate allosteric sites were docked against (Extended 
Data Fig. 2g), based on the binding regions of aminergic GPCRs, the 
peptide and antagonist sites of CXCR4, and the muscarinic receptor 
allosteric site. Iterative cycles of modelling and optimization (Fig. 2b-e) 
attempted to capture two aspects of ligand binding. First, the activity of 


Figure 2 panel labels (as extracted): Align GPR68 family sequences to CXCR4 
(~29% identity); 3,307 diverse homology models; Sample potential lorazepam 
binding modes using DOCK; Optimize binding site around docked pose using PLOP; 
Re-dock into optimized binding sites; Choose final lorazepam-binding mode; 
screen ~3.1 million ZINC lead-like molecules; Optimize scaffold for probe 
activity; ZINC67740571 (ogerin). Labelled residues: Glu160, Arg189, His269. 

Figure 2 | Virtual screening workflow and predicted location of 
GPR68 allosteric site. a, Sequence alignment of GPR68, GPR4, GPR65 
and GPR132 to CXCR4 (details in Extended Data Fig. 2e). b, Docking 
of lorazepam and NCC library to five distinct binding sites (details in 
Extended Data Fig. 2f). c, Models evaluated by their favourable ranking 
of lorazepam versus decoy molecules. d, Optimizing the most favourable 
lorazepam binding mode. e, Optimized lorazepam orientation (grey stick) 
in GPR68 (cyan ribbon) and M2 muscarinic receptor (salmon ribbon; 
Protein Data Bank (PDB) code 4MQT) with allosteric site (grey) and 
orthosteric site (quinuclidinyl benzilate, magenta). f, Lorazepam in its 
predicted orientation and interactions. g, Virtual screen of ZINC subset 
(~3.1 million molecules) to identify predicted hits. h, ZINC67740571 
(magenta stick) in its predicted orientation and interactions in GPR68. 


[Figure 3 panels: a, bar chart of relative activity (fold of basal) at
pH 6.50 and pH 7.40 for buffer, lorazepam and compounds identified by
ZINC number, with structures of the lead scaffolds and a linker table
(1, ogerin; 2, C2; 3, C3; 4, C4); b-e, concentration-response curves
versus log[H+] at increasing modulator concentrations (µM).]


Figure 3 | Identification, characterization and optimization of GPR68
positive allosteric modulators. a, Normalized results of GPR68-mediated
cAMP production for selected compounds (ZINC database numbers) are
shown; data represent mean ± s.e.m. (n = 4-34 measurements) at 10 µM
for pH 7.40 and 6.50. Compounds were grouped into a first batch from the
first round of virtual docking, and a second batch from the second round
of docking. Compounds labelled Isx are isoxazole analogues. Lead
compounds ZINC32587282, ZINC4929116, ZINC67740571 (ogerin), its isomer
(ZINC32547799) and analogues (C2, C3 and C4) with different lengths of
linkers, are highlighted. b-e, Concentration-response curves of
normalized data (mean ± s.e.m.; n = 4) for ogerin (b), C2 (c), C3 (d)
and C4 (e) are shown to illustrate the allosteric potentiation of proton
activity, and were analysed using a standard operational allosteric
model. Allosteric parameters are summarized in Supplementary Table 8,
and curve-fitting details are in Methods.


the benzodiazepines as PAMs, and second, the role of histidine residues
17, 84, 169 and 269, which are thought to interact with one another in
the inactive state, and move apart on protonation at lower pH values
(ref. 7). This cycle converged to a stable lorazepam docking pose
(Fig. 2f), and to its ranking first among the 622 decoy molecules. This
strategy resembles previous ligand-guided docking (refs 27-29), although
here the binding site was unknown. In its docked geometry, lorazepam
hydrogen bonds with Glu160, Arg189, Tyr244 and Tyr268, and forms
non-polar contacts with Trp77, Leu101, Phe173, His269 and Leu272
(Fig. 2f).

To test the modelled lorazepam site, we mutated the Glu160, Arg189
and His269 residues lining the site (Fig. 2f and Extended Data
Fig. 2e, f), and determined their roles in proton-mediated cAMP
production and calcium release (Extended Data Fig. 4). The His269Phe
mutant right-shifted proton concentration-response curves in both assays
(ref. 7), while substitutions at Arg189 selectively abolished cAMP
production. Different substitutions at Glu160 had varying effects at
downstream signalling pathways: Glu160Ala left-shifted the proton
concentration-response curve and reduced cAMP production, but was
inactive in calcium release, while the Glu160Lys and Glu160Gln mutations
had modest effects in both pathways (the mutants had little effect on
expression, Extended Data Fig. 4c). These substantial and differential
effects on downstream coupling support a role for these residues in the
functions of GPR68, and are consistent with the modelled binding site
for lorazepam.


Seeking optimized PAMs, we computationally docked 3.1 million 
available lead-like molecules against the putative lorazepam site in 
GPR68. Overall, more than 3.3 trillion complexes were calculated and 
scored. From among the top 0.1% of the docking-ranked molecules, 17 
were purchased for testing; along with their high docking ranks, these 
compounds recapitulated key interactions made by lorazepam in its 
docked model, were chemically diverse and had high-scoring analogues 
(Supplementary Table 2). 
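
As a rough check on the scale of this screen, the figures quoted above
can be combined in a few lines. The variable names are ours, and the 3.3
trillion figure is treated as a lower bound because the text says "more
than".

```python
library_size = 3_100_000      # lead-like molecules docked (from the text)
complexes_scored = 3.3e12     # "more than 3.3 trillion" complexes (lower bound)
top_fraction = 0.001          # top 0.1% of the docking-ranked list
purchased = 17                # molecules bought for testing (from the text)

top_count = round(library_size * top_fraction)
complexes_per_molecule = complexes_scored / library_size
purchased_share = purchased / top_count

print(top_count)
print(f"{complexes_per_molecule:.2e}")
print(f"{purchased_share:.2%}")
```

On these numbers the top 0.1% corresponds to roughly 3,100 molecules, each
molecule was sampled in about a million receptor-ligand complexes on
average, and the 17 purchased compounds are well under 1% of the
top-ranked set.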

Four of the docking hits increased cAMP production by about 
1.5-fold over basal at pH 6.5 (Fig. 3a). Although none was as active as 
lorazepam, two compounds, ZINC4929116 and ZINC32587282, had 
hundreds of available analogues. These were docked against the GPR68 
model, and 25 were chosen for testing (Fig. 3a and Supplementary 
Table 3). Thirteen had greater activity than lorazepam, and their 
pH-dependent potentiation activity clearly indicates allostery. Although 
dissimilar, lorazepam and ZINC67740571 dock to form many of the 
same interactions, with the addition of a new predicted hydrogen-bond 
to Glu160 from the hydroxyl of ZINC67740571 (Fig. 2f, h and Extended 
Data Fig. 2h). 


Ogerin as a selective GPR68 PAM 
Ten selected compounds were studied further in functional assays.
According to the standard allosteric operational model (ref. 30), all were




[Figure 4 panels: a, Gs pathway (cAMP production), RLU (fold of basal)
versus log[H+]; b, Gq pathway (calcium release), RFU (fold of basal)
versus log[H+]; series in a and b: GPR68, GPR68 + 32547799, GPR68 +
ogerin; c, contextual freezing (relative %, normalized to vehicle
control); d, cue freezing (relative %, normalized to vehicle control).]


Figure 4 | Ogerin modulates signalling and memory. a, b, Ogerin and
ZINC32547799 (10 µM) modulate proton-mediated cAMP production
(a, n = 4) and calcium mobilization (b, n = 5). Data in a and b are
mean ± s.e.m. RFU, relative fluorescence units. c, d, Ogerin but not its
isomer (ZINC32547799) decreased contextual memory retrieval in wild-type
(WT; n = 7) but not GPR68-knockout (KO; n = 8) C57BL/6J male mice
(c, Fa.27)=4.71, P < 0.05 for drug × genotype effect, P < 0.05 for ogerin
in wild-type mice, two-way analysis of variance (ANOVA), Bonferroni’s
post-hoc test); neither had an effect on cued memory retrieval in either
wild-type (n = 6) or knockout (n = 7) C57BL/6J male mice (d). Results
(c, d) were normalized to vehicle control; see also Extended Data Fig. 8d-i.


GPR68 PAMs, lacking intrinsic activity but increasing agonist potency
(α-factor) for cAMP production by 1.9-8.2-fold, and increasing efficacy
(β-factor) by 1.1-5.6-fold (Supplementary Table 5). This ability to
shift concentration-response curves leftward and upward (Extended
Data Fig. 4b) is the key characteristic of a PAM. ZINC67740571
had a much higher allosteric effect than lorazepam (Fig. 3b versus
Fig. 1d, and Supplementary Table 8); we denoted it ‘ogerin’ (for OGR1
ligand).
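
How the α- and β-factors reshape a concentration-response curve can be
illustrated with a commonly used form of the operational model of
allosterism. This is a sketch under stated assumptions: the equation is
the generic textbook form, the exact fitting procedure used in the paper
is described in its Methods and is not reproduced here, and every
parameter value below is hypothetical.

```python
def allosteric_response(a, b, ka, kb, tau_a, tau_b, alpha, beta,
                        e_max=100.0, n=1.0):
    """Response to orthosteric agonist concentration `a` in the presence
    of modulator concentration `b`.

    ka, kb       : equilibrium dissociation constants of agonist and modulator
    tau_a, tau_b : operational efficacies of agonist and modulator
    alpha        : binding cooperativity (affects agonist potency)
    beta         : efficacy cooperativity (affects agonist efficacy)
    n            : transducer slope
    """
    # Stimulus-generating term of the model
    stimulus = tau_a * a * (kb + alpha * beta * b) + tau_b * b * ka
    # Term summing free, singly bound and doubly bound receptor states
    receptor_states = a * kb + ka * kb + ka * b + alpha * a * b
    return e_max * stimulus ** n / (receptor_states ** n + stimulus ** n)


# Hypothetical parameters: a pure PAM has no intrinsic efficacy (tau_b = 0).
params = dict(ka=1e-7, kb=1e-6, tau_a=1.0, tau_b=0.0, alpha=5.0, beta=1.0)

without_pam = allosteric_response(a=1e-7, b=0.0, **params)
with_pam = allosteric_response(a=1e-7, b=1e-5, **params)
print(round(without_pam, 2), round(with_pam, 2))
```

With α greater than 1 the same agonist concentration produces a larger
response once the modulator is present, which is the leftward shift
described above; setting β above 1 would additionally raise the maximal
response.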

Ogerin and ZINC32547799 are close analogues (Fig. 3a), but 
each had distinct functional activities (Fig. 4a and Extended Data 
Fig. 4f, g) and docking poses (Fig. 2 and Extended Data Fig. 2h). Thus, 
the ortho-hydroxylmethyl group, which differentiates them, may have 
a key role in determining PAM activity, perhaps because of its ability 
to hydrogen-bond with Glu160, which the meta-positioned hydroxyl- 
methyl in ZINC32547799 cannot reach. The structure-guided mutants 
His269Phe and Arg189Leu responded to ogerin and ZINC32547799 
differently (Fig. 4a, Extended Data Fig. 4f, g and Supplementary 
Table 6), supporting the modelled interactions with these residues. 
Notably, rather than activating, ogerin inhibited proton-mediated 
calcium release—a pathway-specific function rescued in Arg189Leu 
and His269Phe (Fig. 4b, Extended Data Fig. 4h, i and Supplementary 
Table 7). Meanwhile, ZINC32547799 had little effect on calcium 
release. To determine whether fast kinetics affect the difference 
between cAMP measurement (under equilibrium) and calcium 
release (non-equilibrium), we also conducted phosphatidylinositol 
hydrolysis assays under equilibrium. Ogerin slightly potentiated pro- 
ton activity here (Extended Data Fig. 4j, k), whereas ZINC32547799 
did not. Furthermore, ogerin had minimal PAM activity at the related 
proton-sensing GPCRs, GPR4 and GPR65 (Extended Data Fig. 2c, d). 
Ogerin seems to be a functionally selective GPR68 PAM for the agonist 
proton. 

If the ogerin-GPR68 model is relevant, we should be able to leverage 
it for optimization. We designed a virtual library of more than 600 
ogerin analogues and docked each into the GPR68 model (Extended 
Data Fig. 2h, i). Thirteen high-scoring analogues were synthesized, 




and three were more active than ogerin (Supplementary Table 9 and 
Extended Data Fig. 6), including the first and seventh ranked com- 
pounds, the latter of which, C2, had the greatest allosteric effect, shift- 
ing the proton response threefold further to the left than does ogerin, 
for an α-factor of 22 (Fig. 3a-c and Supplementary Table 8). C2 differs
from ogerin by the addition of a methylene to the benzylamine side 
chain, which places the phenyl ring deeper into a modelled apolar 
pocket (Extended Data Fig. 2i). The addition of one or two further 
methylenes in compounds C3 and C4 (Fig. 3a), conversely, reduced 
allostery (Supplementary Table 8 and Extended Data Fig. 6f), consistent 
with reduced complementarity to the apolar pocket in the modelled 
complex. 

To investigate ogerin specificity for GPR68 over unrelated targets,
which might affect its usefulness as a biological probe, we first
computationally screened ogerin and its analogues for off-targets using
the Similarity Ensemble Approach (SEA) program (ref. 31) against a panel
of 2,800 targets. These calculations revealed similarity between the
GPR68 ligands and those of only three other GPCRs: the ghrelin and
adenosine A1 and A2A receptors. Subsequent physical profiling against 58
GPCRs, ion channels and transporters (Extended Data Fig. 3) revealed that
ogerin had moderate affinity at two GPCRs, 5-hydroxytryptamine 2B
(5-HT2B) and the A2A receptor (Extended Data Fig. 5h, i), the latter
consistent with the SEA prediction.
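
SEA relates targets to one another through the chemical similarity of
their ligand sets. The sketch below illustrates only the pairwise
building block of such a comparison, a Tanimoto coefficient between
fingerprints, summed over pairs above a cutoff. It does not reproduce
SEA's statistical model; the fingerprints, the cutoff and the function
names are hypothetical.

```python
from typing import List, Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of
    'on' feature indices."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def raw_set_score(ligands_a: List[Set[int]], ligands_b: List[Set[int]],
                  cutoff: float = 0.5) -> float:
    """Sum of pairwise Tanimoto coefficients at or above `cutoff`
    between two ligand sets (an unnormalized similarity score)."""
    return sum(t for fa in ligands_a for fb in ligands_b
               if (t := tanimoto(fa, fb)) >= cutoff)


# Hypothetical fingerprints (illustrative feature indices only).
query_ligand = {1, 2, 3, 4}
target_ligands = [{3, 4, 5, 6}, {1, 2, 3, 5}]

print(tanimoto(query_ligand, target_ligands[0]))
print(raw_set_score([query_ligand], target_ligands))
```

Here only the second hypothetical target ligand clears the cutoff, so the
raw score equals its single Tanimoto coefficient; a real comparison would
use chemically derived fingerprints and a background model to judge
significance.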

Intrigued by the association between the GPR68 PAMs and adenosine
receptor antagonists, we computationally screened a library
(http://www.tocris.com/dispprod.php?ItemId=5386#.U_s5ZMVdUrU)
of 1,120 reagents and drugs against the GPR68 ligands, again using
SEA. SLV320, a selective adenosine A1 antagonist (ref. 32), was
predicted to be a GPR68 PAM and confirmed by a physical screen of the
full library (SLV320 αβ = 2.8) (Extended Data Fig. 7 and Supplementary
Table 8), as was a second adenosine receptor antagonist, CGH2466
(αβ = 2.9), and tracazolate (αβ = 3.4), a GABAergic (GABA-mediated) drug
that also antagonizes adenosine receptors (ref. 33). Although CGH2466 has
the lowest apparent binding constant (KB) of any GPR68 PAM (48 nM), its
allostery is much lower than that of ogerin; additionally, like SLV320
and tracazolate, CGH2466 is a potent phosphodiesterase inhibitor
(Extended Data Fig. 7) and had minimal activity in the presence of
Ro 20-1724. This previously unknown cross-talk among the GPR68, adenosine
and GABA receptor ligands (Extended Data Fig. 7d), along with their
activities at phosphodiesterases, should be considered when evaluating
the pharmacology of what have been considered specific probes and drugs.


Ogerin as a GPR68 probe 

Given its activity and specificity, we sought to explore the downstream 
signalling and in vivo activity of ogerin. In GPR68-expressing HEK293 
cells, we found that both ogerin and lorazepam activate the protein 
kinase A (PKA) and mitogen-activated protein (MAP) kinase path- 
ways (Extended Data Fig. 8a), mimicking the low pH-induced signal- 
ling observed with GPR68 receptors in human airway smooth muscle 
cells’®. The activation of GPR68 in smooth muscle cells by extracellular 
acidification is linked to several downstream pathways and biological 
responses!®!9.22,34-37, which a selective allosteric modulator, such as 
ogerin, may help to disentangle. 

To investigate effects in behaviour associated with modulation of 
the hippocampus, where GPR68 is highly expressed’, we evaluated 
GPR68-knockout and wild-type mice in a learning and memory test, 
fear conditioning, in which the hippocampus has important roles 
(Extended Data Figs 8 and 11). In wild-type mice, ogerin attenuated 
contextual-based fear memory without effects on cue-based memory 
(Fig. 4c, d). The magnitude of these effects is comparable to those
of compounds targeting other hippocampus-expressed GPCRs (refs 38, 39),
and larger effects are rarely observed without surgical lesion of the
hippocampus (ref. 40). Crucially, the administration of ogerin had no effect
on memory retrieval in GPR68-knockout mice (Fig. 4c, d), indicating 
that the in vivo effects of ogerin are GPR68-dependent. Furthermore, 
the less active ogerin isomer, ZINC32547799, had no measurable effect 


[Figure 5 panels: a-c, docked poses with labelled residues including
Asp153, Ser90, Thr66, Tyr63, Arg187, Trp70, Phe242, Tyr272 and Arg273;
d, activity at pH 6.80, 7.00, 7.40 and 8.40; e, concentration-response
curves versus log[drug] for BTB09089 and 13684400 in GPR65, GPR68 and
HEK293T cells; f, curves versus log[BTB09089] at several concentrations
of 62678696 (µM), with fitted parameters including log KB = -4.71,
τA = 2.02, τB = 0, log α = 0, log β = -1.42 and Hill slope = 2.82;
g, ternary complex with labelled residues including Asp153, Trp70,
Arg187 and Phe239.]

Figure 5 | Discovery of GPR65 allosteric agonist and negative allosteric
modulator. a-c, Predicted interactions of BTB09089 (a), ZINC13684400
(b) and ZINC62678696 (c) with GPR65. Overlaid ogerin (thin magenta
lines) (a) or BTB09089 (thin blue lines) docking poses with GPR68 or
GPR65, respectively (b, c). d, ZINC13684400 (30 µM) displayed GPR65
allosteric agonist activity at pH 8.40 but not at lower pH or in control
cells (n= : measurements). e, ZINC13684400 as a GPR65 agonist at pH 8.40
(n = 3). f, ZINC62678696 shifts BTB09089 curves downward at pH 8.40
(n = 4). KA and KB are the equilibrium binding affinities of the
orthosteric agonist proton (A) and allosteric modulator (B),
respectively. Normalized results (d-f) are mean ± s.e.m., and curves
were analysed using a four-parameter logistic function (e) or a standard
operational allosteric model (f). g, Predicted ternary complex between
GPR65, ZINC62678696 and BTB09089, detailed interactions (left) and
overall orientation in the GPR65 structure (right).


on learning and memory in wild-type mice (Fig. 4c, d and Extended
Data Fig. 8d-i). The effects of ogerin thus support a role for GPR68 in
hippocampal-associated memory.


General applicability of the approach

To explore the broader usefulness of this approach, we sought ligands
for GPR65, another understudied pH-sensing receptor, which shares
37% sequence identity with GPR68. We found that a recently reported
GPR65 agonist BTB09089 (ref. 41) is an allosteric agonist of GPR65
(Fig. 5d, e and Extended Data Fig. 10a). We used BTB09089 to anchor
modelling of GPR65, generating 500 homology models templated on
GPR68. The final docked GPR65-BTB09089 model resembles that of
GPR68-ogerin, with several side-chain substitutions in the putative
binding site (Fig. 5a).

We docked the same 3.1 million compounds against the GPR65
model, purchasing 45 new molecules for testing (Fig. 5a-c and
Supplementary Table 10). ZINC13684400 showed agonist activity
of more than twofold of basal at GPR65, with a potency of 500 nM,
without measurable activity at control cells (Fig. 5e and Extended Data
Fig. 9). As with BTB09089, ZINC13684400 did not potentiate proton
efficacy at GPR65 (Fig. 5d), but acted as an allosteric agonist. To test
the model, three residues modelled to interact with both BTB09089
and ZINC13684400, Arg187, Phe242 and Tyr272, were mutated, as was
Asp153, which appears to hydrogen-bond only with ZINC13684400
(Fig. 5b). Arg187Leu, Phe242Ala and Tyr272Ala reduced the activity


of both compounds (Extended Data Fig. 10f, g), whereas Asp153Ala 
had no effect on BTB09089 but much reduced the activity of 
ZINC13684400, consistent with the model. Several other docking hits 
inhibited GPR65 when the receptors were activated by protons or by 
BTB09089, including ZINC62678696 (Extended Data Fig. 10b-d). 
Unexpectedly, ZINC62678696 does not compete with BTB09089, as
predicted, but rather acts as a BTB09089 negative allosteric modula-
tor (Fig. 5f), suggesting that the two molecules can bind to GPR65 
simultaneously (Fig. 5g). 
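
The agonist curves in this section are described as being analysed with a
four-parameter logistic function, and a potency of 500 nM is quoted for
ZINC13684400. A minimal sketch of that function is given below; the
bottom and top plateaux and the Hill slope are hypothetical values chosen
for illustration, and only the 500 nM midpoint is taken from the text.

```python
import math


def four_parameter_logistic(log_conc, bottom, top, log_ec50, hill):
    """Response at a given log10 concentration for a sigmoidal
    concentration-response curve with the four usual parameters."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_conc) * hill))


log_ec50 = math.log10(500e-9)   # 500 nM potency quoted in the text
curve = dict(bottom=1.0, top=2.0, log_ec50=log_ec50, hill=1.0)  # hypothetical

low = four_parameter_logistic(-9.0, **curve)
mid = four_parameter_logistic(log_ec50, **curve)
high = four_parameter_logistic(-4.0, **curve)
print(round(low, 3), round(mid, 3), round(high, 3))
```

By construction the response at the EC50 lies halfway between the two
plateaux, which is how the potency is read off a fitted curve.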


Discussion 

A combined empirical and structure-based approach discovered potent 
PAMs at the understudied receptor GPR68, and an allosteric agonist 
and negative allosteric modulators for the understudied GPR65. This 
supports the usefulness of the approach for illuminating the ‘dark 
matter’ of the GPCRs, the 38% of non-olfactory GPCR targets whose
ligands and function are understudied or unknown (ref. 1). Whereas truly
high-throughput screens are impractical for targets of unknown func- 
tion, lower-throughput screens are often feasible. Although the hits 
from such a screen may be unsuitable as probes, they can anchor com- 
putational screens for more optimized compounds. Correspondingly, 
we would not ordinarily expect docking to succeed against models of 
a target that shares only 29% sequence identity with its nearest tem- 
plate. By calculating several thousand models, and insisting that the 
relevant ones are those that prioritize active over inactive molecules, 




functionally relevant models are prioritized. The new ligands that 
emerged are specific for the target and one is active in vivo, supporting 
their use as chemical probes for the function of GPR68.

Pharmacologically, the most unexpected observation was the activ- 
ity of GPR68 in learning and memory. Previous studies in GPR68- 
knockout mice revealed only modest phenotypic changes'®?!*, none in 
higher brain function, even though GPR68 is most highly expressed in 
the brain. Ogerin transiently and reversibly reduced contextual-based 
fear memory in wild-type but not GPR68-knockout mice, consistent 
with on-target activity in vivo. In hindsight, this is perhaps only accessi- 
ble to chemical modulators, which can have PAM activities. Inhibitory 
genetic perturbations, such as knockouts or knockdowns, although 
crucial to demonstrating on-target activity through chemical genetic 
epistasis, cannot on their own reveal such activation-based modulation. 

Deorphanizing a receptor can also illuminate its off-target roles for 
known drugs. The observation that lorazepam and its primary metab- 
olite, desmethyldiazepam, are GPR68 PAMs may clarify several of the 
idiosyncratic effects of this widely used anxiolytic. Lorazepam, uniquely
among benzodiazepines, can treat catatonia, an effect proposed to
involve an unknown secondary target (ref. 42). GPR68 may have a role in
this efficacy, as both drug and metabolite reach micromolar
concentrations in plasma during treatment (ref. 43).

Certain caveats bear airing. The combination of empirical and com- 
putational screens will not work for all orphan receptors. GPCRs that 
are poorly expressed or non-functional in yeast or transfected cells 
will be problematic, and some orphans will simply not recognize any 
of the molecules screened in the small empirical libraries. Also, some 
orphans will bear too little similarity to templates of known structure 
to support accurate modelling. Even those that do work will demand
cycles of testing and optimization, which were crucial for both GPR65
and GPR68.

These cautions should not obscure the key observations from this
study: that combining empirical and structure-based screening led to a
probe molecule that reveals some of the functions of GPR68. The finding
that ogerin potentiates GPR68 activation and downstream MAP
kinase pathways, and previous observations that the receptor mediates
airway inflammation, enable campaigns for GPR68 PAMs that may
regulate respiratory inflammatory responses. Uniquely as PAMs, these
compounds would have fidelity to the natural spatial and temporal
activation of GPR68. Correspondingly, the role of GPR68 in anxiety
offers a new route to treating this condition and related CNS disorders,
an area in need of new therapeutic modalities (ref. 44). Methodologically, this
approach may have broad application to illuminating the function of 
the dark matter of the genome, that still large area of pharmacology in 
which targets are known, but function is hidden. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 2 January; accepted 4 September 2015. 
Published online 9 November 2015. 


1. Roth, B. L. & Kroeze, W. K. Integrated approaches for genome-wide 
interrogation of the druggable non-olfactory G protein-coupled receptor 
superfamily. J. Biol. Chem. 290, 19471-19477 (2015). 

2. Davenport, A. P. et al. International Union of Basic and Clinical Pharmacology.
LXXXVIII. G protein-coupled receptor list: recommendations for new pairings 
with cognate ligands. Pharmacol. Rev. 65, 967-986 (2013). 

3. Chung, S., Funakoshi, T. & Civelli, O. Orphan GPCR research. Br. J. Pharmacol. 
153 (suppl. 1), S339-S346 (2008). 

4. Knapp, S. et al. A public-private partnership to unlock the untargeted kinome. 
Nature Chem. Biol. 9, 3-6 (2013). 

5. Ferguson, F. M. et al. Targeting low-druggability bromodomains: fragment
based screening and inhibitor design against the BAZ2B bromodomain.
J. Med. Chem. 56, 10183-10187 (2013).

6. Leung, D., Hardouin, C., Boger, D. L. & Cravatt, B. F. Discovering potent and 
selective reversible inhibitors of enzymes in complex proteomes. Nature 
Biotechnol. 21, 687-691 (2003). 

7. Ludwig, M. G. et al. Proton-sensing G-protein-coupled receptors. Nature 425, 
93-98 (2003). 


8. Mogi, C. et al. Sphingosylphosphorylcholine antagonizes proton-sensing
ovarian cancer G-protein-coupled receptor 1 (OGR1)-mediated inositol
phosphate production and cAMP accumulation. J. Pharmacol. Sci. 99,
160-167 (2005).

9. Li, J. et al. Ovarian cancer G protein coupled receptor 1 suppresses cell
migration of MCF7 breast cancer cells via a Gα12/13-Rho-Rac1 pathway.
J. Mol. Signal. 8, 6 (2013).

10. Singh, L. S. et al. Ovarian cancer G protein-coupled receptor 1, a new
metastasis suppressor gene in prostate cancer. J. Natl Cancer Inst. 99,
1313-1327 (2007).

11. Schneider, J. W. et al. Coupling hippocampal neurogenesis to brain pH through
proneurogenic small molecules that regulate proton sensing G protein-coupled
receptors. ACS Chem. Neurosci. 3, 557-568 (2012).

12. Frick, K. K., Krieger, N. S., Nehrke, K. & Bushinsky, D. A. Metabolic acidosis
increases intracellular calcium in bone cells through activation of the proton
receptor OGR1. J. Bone Miner. Res. 24, 305-313 (2009).

13. Komarova, S. V., Pereverzev, A., Shum, J. W., Sims, S. M. & Dixon, S. J.
Convergent signaling by acidosis and receptor activator of NF-κB ligand
(RANKL) on the calcium/calcineurin/NFAT pathway in osteoclasts. Proc. Natl
Acad. Sci. USA 102, 2643-2648 (2005).

14. Yang, M. et al. Expression of and role for ovarian cancer G-protein-coupled
receptor 1 (OGR1) during osteoclastogenesis. J. Biol. Chem. 281, 23598-23605
(2006).

15. Russell, J. L. et al. Regulated expression of pH sensing G protein-coupled
receptor-68 identified through chemical biology defines a new drug target for
ischemic heart disease. ACS Chem. Biol. 7, 1077-1083 (2012).

16. Mohebbi, N. et al. The proton-activated G protein coupled receptor OGR1
acutely regulates the activity of epithelial proton transport proteins. Cell.
Physiol. Biochem. 29, 313-324 (2012).

17. Regard, J. B., Sato, I. T. & Coughlin, S. R. Anatomical profiling of G protein-
coupled receptor expression. Cell 135, 561-571 (2008).

18. Chen, Y. J., Huang, C. W., Lin, C. S., Chang, W. H. & Sun, W. H. Expression and
function of proton-sensing G-protein-coupled receptors in inflammatory pain.
Mol. Pain 5, 39 (2009).

19. Saxena, H. et al. The GPCR OGR1 (GPR68) mediates diverse signalling and
contraction of airway smooth muscle in response to small reductions in
extracellular pH. Br. J. Pharmacol. 166, 981-990 (2012).

20. Wang, J., Sun, Y., Tomura, H. & Okajima, F. Ovarian cancer G-protein-coupled
receptor 1 induces the expression of the pain mediator prostaglandin E2 in
response to an acidic extracellular environment in human osteoblast-like cells.
Int. J. Biochem. Cell Biol. 44, 1937-1941 (2012).

21. Li, H. et al. Abnormalities in osteoclastogenesis and decreased tumorigenesis
in mice deficient for ovarian cancer G protein-coupled receptor 1. PLoS ONE 4,
e5705 (2009).

22. Aoki, H. et al. Proton-sensing ovarian cancer g protein-coupled receptor 1 on
dendritic cells is required for airway responses in a murine asthma model.
PLoS ONE 8, e79985 (2013).

23. Mogi, C., Nakakura, T. & Okajima, F. Role of extracellular proton-sensing OGR1
in regulation of insulin secretion and pancreatic β-cell functions. Endocr. J. 61,
101-110 (2013).

24. Okajima, F. Regulation of inflammation by extracellular acidification and
proton-sensing GPCRs. Cell. Signal. 25, 2263-2271 (2013).

25. Dong, S., Rogan, S. C. & Roth, B. L. Directed molecular evolution of DREADDs:
a generic approach to creating next-generation RASSLs. Nature Protocols 5,
561-573 (2010).

26. Mysinger, M. M. & Shoichet, B. K. Rapid context-dependent ligand desolvation
in molecular docking. J. Chem. Inf. Model. 50, 1561-1573 (2010).

27. Evers, A. & Klebe, G. Ligand-supported homology modeling of G-protein-
coupled receptor sites: models sufficient for successful virtual screening.
Angew. Chem. Int. Ed. Engl. 43, 248-251 (2004).

28. Cavasotto, C. N. et al. Discovery of novel chemotypes to a G-protein-coupled
receptor through ligand-steered homology modeling and structure-based
virtual screening. J. Med. Chem. 51, 581-588 (2008).

29. Katritch, V., Rueda, M., Lam, P. C.-H., Yeager, M. & Abagyan, R. GPCR 3D
homology models for ligand screening: lessons learned from blind
predictions of adenosine A2a receptor complex. Proteins 78, 197-211
(2010).

30. Leach, K., Sexton, P. M. & Christopoulos, A. Allosteric GPCR modulators: taking
advantage of permissive receptor pharmacology. Trends Pharmacol. Sci. 28,
382-389 (2007).

31. Keiser, M. J. et al. Relating protein pharmacology by ligand chemistry.
Nature Biotechnol. 25, 197-206 (2007).

32. Kalk, P. et al. The adenosine A1 receptor antagonist SLV320 reduces
myocardial fibrosis in rats with 5/6 nephrectomy without affecting blood
pressure. Br. J. Pharmacol. 151, 1025-1032 (2007).

33. Thompson, S.-A., Wingrove, P. B., Connelly, L., Whiting, P. J. & Wafford, K. A.
Tracazolate reveals a novel type of allosteric interaction with recombinant
γ-aminobutyric acid_A receptors. Mol. Pharmacol. 61, 861-869 (2002).

34. Tomura, H. et al. Prostaglandin I2 production and cAMP accumulation in
response to acidic extracellular pH through OGR1 in human aortic smooth
muscle cells. J. Biol. Chem. 280, 34458-34464 (2005).

35. Ichimonji, I. et al. Extracellular acidification stimulates IL-6 production and
Ca2+ mobilization through proton-sensing OGR1 receptors in human airway
smooth muscle cells. Am. J. Physiol. Lung Cell. Mol. Physiol. 299, L567-L577
(2010).




36. Liu, J. P. et al. Ovarian cancer G protein-coupled receptor 1-dependent
and -independent vascular actions to acidic pH in human aortic smooth
muscle cells. Am. J. Physiol. Heart Circ. Physiol. 299, H731-H742 (2010).

37. Matsuzaki, S. et al. Extracellular acidification induces connective tissue
growth factor production through proton-sensing receptor OGR1 in human
airway smooth muscle cells. Biochem. Biophys. Res. Commun. 413, 499-503
(2011).

38. Gravius, A., Barberi, C., Schafer, D., Schmidt, W. J. & Danysz, W. The role of
group I metabotropic glutamate receptors in acquisition and expression of
contextual and auditory fear conditioning in rats - a comparison.
Neuropharmacology 51, 1146-1155 (2006).

39. Daumas, S. et al. Transient activation of the CA3 Kappa opioid system in the
dorsal hippocampus modulates complex memory processing in mice.
Neurobiol. Learn. Mem. 88, 94-103 (2007).

40. Phillips, R. G. & LeDoux, J. E. Differential contribution of amygdala and
hippocampus to cued and contextual fear conditioning. Behav. Neurosci. 106,
274-285 (1992).

41. Onozawa, Y. et al. Activation of T cell death-associated gene 8 regulates the
cytokine production of T cells and macrophages in vitro. Eur. J. Pharmacol. 683,
325-331 (2012).

42. Pompéia, S., Manzano, G. M., Tufik, S. & Bueno, O. F. What makes lorazepam
different from other benzodiazepines? J. Physiol. (Lond.) 569, 709 (2005).

43. Greenblatt, D. J. et al. Clinical pharmacokinetics of lorazepam. I. Absorption
and disposition of oral 14C-lorazepam. Clin. Pharmacol. Ther. 20, 329-341
(1976).

44. Schoepp, D. D. Where will new neuroscience therapies come from? Nature Rev.
Drug Discov. 10, 715-716 (2011).

Supplementary Information is available in the online version of the paper.

Acknowledgements This work was supported by National Institutes of Health
(NIH) grants U01104974 (B.L.R., B.K.S. and W.K.K.), R01 DA017204 (B.L.R. and
W.K.K.) and the National Institute of Mental Health Psychoactive Drug Screening
Program (NIMH PDSP) (X.-P.H., H.Z., M.S.F., W.K.K., T.J.M., A.J. and B.L.R.), the
Michael Hooker Chair for Protein Therapeutics and Translational Proteomics
to B.L.R.; Genentech Foundation Predoctoral Fellowship (J.K.); NIH grants
GM59957 and GM71896 (B.K.S.) and the Structural Genomics Consortium
(B.K.S.); grant P01 HL114471 (R.B.P. and D.A.D.); NICHD grant U54 HD079124
(M.S.M., K.A.S., V.N.); NIH grant U19MH082441 (B.L.R., J.J. and X.C.). We thank
Mark Pausch (Merck & Co.) for providing us G,- and G,-yeast strains for yeast
screening assays.

Author Contributions X.-P.H. subcloned GPR68 for yeast screening, made
GPR68 and GPR65 mutants, designed, carried out cell-based screening assays,
analysed results, and wrote the paper. J.K. designed and developed homology
models, carried out docking screens, analysed results, and wrote the paper.
W.K.K. set up and performed yeast screening assays, analysed results, and
wrote the paper. H.Z. and M.S.F. designed, performed in vivo fear-conditioning
studies, analysed results, and wrote the paper. M.S.F. and B.L.R. dubbed
ZINC67740571 ‘ogerin’. B.H.K. created the GPR68-knockout mice. S.S.M.,
K.A.S. and V.N. carried out initial phenotypic characterization, analysed results,
and wrote the paper. X.C. and J.J. synthesized ZINC32547799, ZINC67740571
(ogerin) and ogerin analogues (compounds 33548-33561, C3 and C4) for
functional assays and in vivo studies, and wrote the paper. T.J.M. carried out
radioligand binding assays. A.J. prepared drug plates and plasmids for initial
screening. R.B.P. and D.A.D. designed and carried out anti-haemagglutinin
immunoblot assays, analysed results, and wrote the paper. S.W. designed
primers, prepared Flag-tagged GPR68 wild-type and mutant plasmids,
performed anti-Flag western blot assays, and analysed results. T.K. analysed
results and wrote the paper. B.L.R. and B.K.S. coordinated and supervised the
project, and with the other authors wrote the paper.

Additional Information Reprints and permissions information is available
at www.nature.com/reprints. The authors declare no competing financial
interests. Readers are welcome to comment on the online version of the paper.
Correspondence and requests for materials should be addressed to
B.L.R. (bryan_roth@med.unc.edu) or B.K.S. (bshoichet@gmail.com).


26 NOVEMBER 2015 | VOL 527 | NATURE | 483 
© 2015 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


METHODS 


Chemicals, reagents and cell lines. Chemicals and reagents used in this study, if
not specified otherwise, were purchased from commercial sources (Sigma, Tocris, 
Fisher Scientific, or specified in Supplementary Tables 2 and 3 of chemical struc- 
tures) or synthesized as outlined in the Supplementary Information. HEK293 
(ATCC CRL-1573; 60113019; certified mycoplasma free and authentic by ATCC) 
and HEK293-T (HEK293T; ATCC CRL-11268; 59587035; certified mycoplasma 
free and authentic by ATCC) cells were from the ATCC. Cells were also validated 
by analysis of short tandem repeat (STR) DNA profiles and these profiles showed
100% match at the STR database from ATCC. Ogerin and its inactive analogue 
ZINC32547799 are available for use as chemical probes from Sigma-Aldrich 
(ogerin: SML1482, ZINC32547799: SML1483). 

Homology modelling. The alignment for the construction of the GPR68 models
was generated using PROMALS3D, and homology models were built with
MODELLER-9v8 (ref. 45), using the crystal structure of the chemokine CXCR4
receptor (PDB code 3ODU) as the template (Extended Data Fig. 2f). This alignment
was also used to generate 500 models of GPR65 directly from the final
GPR68 model. The initial alignment included both human and mouse sequences
of GPR68, as well as those of its closest homologue, GPR4. These were aligned
against the whole human C-X-C chemokine receptor family. The alignment was
manually edited to: remove the amino and carboxy termini that extended past the
template structure, remove the engineered T4 lysozyme, and create different
alignments of the flexible and non-conserved second extracellular loop (the final
result is given in the provided alignment, Extended Data Fig. 2f). A total of 407 models
were built directly based on the CXCR4 crystal structure, using MODELLER-9v8
(ref. 45), while five more were built from each of 580 elastic network models
(ENMs), produced by the program 3K-ENM (ref. 46), for a total of 3,307 models built
during each iterative round of model refinement. Models with constraints between
pairs of extracellular His residues (His17-His169, His17-His269, His17-His84
and His84-His169) to mimic the inactive state of the protein were generated by
enforcing a distance constraint of 2.7 Å between the imidazole nitrogens, with a
standard deviation of 0.1 Å. Confirmed active compounds and analogues identified
using the CXCR4-based model had neither agonist nor antagonist activity at CXCR4
receptors (Extended Data Fig. 5j, k).
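
The model count and the His-His distance restraint quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function names are hypothetical, and the quadratic penalty is one common way to write a Gaussian distance restraint with the stated mean (2.7 Å) and standard deviation (0.1 Å). It is not taken from the modelling software used in the study.

```python
def total_models(direct: int, enm_count: int, per_enm: int) -> int:
    """Models built directly from the template plus those built from each ENM."""
    return direct + enm_count * per_enm


def gaussian_restraint_penalty(distance: float, mean: float = 2.7, sd: float = 0.1) -> float:
    """Penalty 0.5 * z**2 for a Gaussian distance restraint (assumed form).

    distance, mean and sd are in angstroms; the penalty is zero at the mean
    and grows quadratically with the number of standard deviations away.
    """
    z = (distance - mean) / sd
    return 0.5 * z * z


if __name__ == "__main__":
    # 407 direct models plus five models from each of 580 ENMs
    print(total_models(407, 580, 5))
    # A His-His nitrogen distance two standard deviations from the target
    print(gaussian_restraint_penalty(2.9))
```

With the numbers given in the text, the first call reproduces the stated total of 3,307 models per refinement round.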

Model evaluation. Before docking, the second extracellular loop (EL2), between 
residues 161-177, was removed from each GPR68 model. Models were ranked on 
the basis of prioritizing active benzodiazepines (lorazepam and desmethyldiaze- 
pam) over the rest of the inactive NCC library that was used in the yeast screen, as 
well as over property-matched decoys. In addition, the docked pose of lorazepam 
had to form a hydrogen bond from its N-H group to a polar side chain in GPR68. 
Five different sites were sampled for possible lorazepam binding, based on the 
locations of the co-crystallized CXCR4 small molecule antagonist IT1t (in PDB
code 3ODU), cyclic peptide CVX15 (in PDB code 3OE0), and the positions of the
biogenic amines crystallized with the β2-adrenergic receptor (PDB code 2RH1) and
the dopamine D3 receptor (PDB code 3PBL). The entire NCC library was docked 
to each of the five sub-sites for several rounds of iterative binding site refinement. 
In each round, the top-ranked models were examined for a binding pose that made 
hydrophobic and electrostatic interactions with the receptor, including the key 
N-H hydrogen bond. Residues within 6 Å of the lorazepam pose were minimized
around the docked ligand with PLOP (ref. 47). The NCC library was then re-docked
into this optimized binding site for each model. This refinement continued for 
several cycles until the top-ranked models all converged to the same lorazepam 
pose. Once the final model was chosen, we built the EL2 back onto the receptor 
using MODELLER-9v8 (ref. 45) and optimized 1,000 different EL2 conformations
around the lorazepam pose with PLOP. Finally, we docked the NCC library
back into these 1,000 different EL2-GPR68 structures, and chose a final model 
that retained the previous pose and prioritized the active over the inactive com- 
pounds. The GPR65 model was generated similarly, using the pose of BTB09089 
as the primary selection criterion, although in this case the EL2 was always pres- 
ent. To determine the ternary complex model of ZINC62678696 and BTB09089, 
ZINC62678696 was docked to the putative binding site in the GPR65 model with 
BTB09089 present. Then, both ligands were minimized with PLOP. Next, the side 
chains of the GPR65 binding pocket were allowed to relax, and, finally, BTB09089 
and ZINC62678696 were simultaneously minimized again with PLOP. Structural 
models (PDB files) of characteristic GPR68-modelled complexes (with ogerin or 
lorazepam) and GPR65-modelled complexes (with BTB09089 or BTB09089 and 
ZINC62678696) are shown in the Supplementary Data. 

Virtual screens. We used DOCK 3.6 to screen the ZINC database (Results). The 
flexible ligand sampling algorithm in DOCK 3.6 superimposes atoms of the docked 
molecule onto binding site matching spheres, which represent favourable posi- 
tions for individual ligand atoms. Forty-five matching spheres were used, using 
the previous refinement round’s pose of lorazepam. The degree of ligand sampling 
is determined by the bin size, bin size overlap and distance tolerance, set at 0.4 Å,
0.1 Å and 1.5 Å, respectively, for both the matching spheres and the docked
molecules. The complementarity of each ligand pose was scored as the sum of the 
receptor-ligand electrostatic and van der Waals’ interaction energies, and corrected 
for context-dependent ligand desolvation. Partial charges from the united-atom 
AMBER force field were used for all receptor atoms; ligand charges and initial 
solvation energies were calculated using AMSOL (refs 48, 49;
http://comp.chem.umn.edu/amsol/). The best-scoring conformation of each docked molecule was then
subjected to 100 steps of rigid-body minimization. 
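
The scoring scheme described above (electrostatic plus van der Waals interaction energy, corrected for ligand desolvation, with the best-scoring conformation kept per molecule) can be illustrated as follows. This is a sketch under stated assumptions and not the DOCK 3.6 implementation: the `Pose` class, its field names and the convention that lower (more negative) scores are better are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Pose:
    """One docked conformation; the three energy terms share the same (assumed) units."""
    molecule_id: str
    electrostatic: float
    van_der_waals: float
    desolvation: float  # correction term, assumed positive when it penalises binding

    @property
    def score(self) -> float:
        # Sum of the interaction energies plus the desolvation correction
        return self.electrostatic + self.van_der_waals + self.desolvation


def best_pose_per_molecule(poses: Iterable[Pose]) -> Dict[str, Pose]:
    """Keep the lowest-scoring pose for each molecule."""
    best: Dict[str, Pose] = {}
    for pose in poses:
        current = best.get(pose.molecule_id)
        if current is None or pose.score < current.score:
            best[pose.molecule_id] = pose
    return best
```

In such a scheme, only the retained pose of each molecule would go forward to the rigid-body minimization step.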

Selection of potential ligands for testing. We docked the approximately 3.1 
million commercially available molecules of the lead-like subset of the ZINC data- 
base to the final GPR68 and GPR65 models. The full hit list was automatically 
filtered to remove molecules that possess high-internal-energy, non-physical con- 
formations, which are not well-modelled by our scoring function. The reported 
rankings reflect this filtering. From the top 0.1% (~3,000 molecules) of the docked 
ranking list, 17 compounds were chosen for testing, based on complementarity to 
the binding site and presence of predicted electrostatic interactions with Glu160, 
Arg189, Tyr244, Tyr268 and His269, mimicking those predicted for lorazepam. For 
GPR65, compounds were chosen based on complementarity to the binding site and 
similarity to the predicted binding pose of BTB09089, modelled to interact with 
Asp153, Arg187 and Tyr272, and by aromatic stacking with Trp70. 
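
The size of the candidate pool follows directly from the numbers above: the top 0.1% of roughly 3.1 million docked molecules is about 3,100, in line with the '~3,000 molecules' quoted. A minimal sketch of such a cut is shown below. The function name and the dictionary of scores are hypothetical, and the example assumes that lower scores rank higher.

```python
from typing import Dict, List


def top_per_thousand(scores: Dict[str, float], per_thousand: int = 1) -> List[str]:
    """Return the identifiers in the best `per_thousand`/1000 of the ranked list.

    Integer arithmetic is used for the cutoff to avoid floating-point rounding.
    """
    ranked = sorted(scores, key=scores.get)  # ascending: best (lowest) score first
    cutoff = len(ranked) * per_thousand // 1000
    return ranked[:cutoff]


if __name__ == "__main__":
    # Size of the top 0.1% for a library of about 3.1 million molecules
    print(3_100_000 * 1 // 1000)
```

The final choice of 17 compounds from this pool was made by inspection of predicted interactions, which a ranking cut alone does not capture.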

In silico lead profiling. To examine specificity and to discover other potential 
GPCR targets for the newly discovered GPR68 PAMs, we used the SEA program
(refs 21, 50), which compares individual ligands and sets of ligands to the ligand sets
for multiple targets; two targets are related, or a particular ligand is predicted to 
modulate a target, if the sets of ligands are related to one another. Here, the query 
set was all of the new GPR68 PAMs, which was screened against either the 2,512
ligand-target sets with activity of 10 µM or better from the ChEMBL12 database (ref. 51),
or against the Tocris Mini library. 

Receptor constructs and yeast growth assays. Twenty-four human GPCR plasmids
(GPR1, GPR4, GPR15, GPR31, GPR39, GPR41, GPR43, GPR45, GPR55,
GPR57, GPR58, GPR62, GPR65, GPR68, GPR83, GPR84, GPR87, GPR88, 
GPR123, GPR132, GPR133, GPR157, GPR161 and ADCYAP1R1) were obtained 
from http://cdna.org, subcloned into the multiple cloning site of the yeast high copy 
number plasmid p426GPD (ref. 52) and were confirmed by full-length sequencing 
(Eton Bioscience). The yeast strains used were provided by M. Pausch (Merck) 
and have been previously described® and used by us”>*! ; MPY578t (G; yeast), 
MPY578q5 (Gq yeast) and MPY578s5 (Gs yeast) express chimaeric G proteins
in which the last five amino acids of the yeast Gα protein are replaced with their
mammalian Gi, Gq or Gs homologues, respectively. These strains contain the HIS3
gene under the control of the FUS1 promoter. GPCR transformants in yeast were 
selected and maintained on synthetic defined (SD) media lacking uracil (Clontech). 
GPR68, indicates the GPR68 paired with G, yeast; while GPR4, indicates GPR4 
paired with G, yeast, and similarly for the other GPCRs. The yeast screening assays 
were carried out as described previously*®. Assays were set up in 96-well flat- 
bottom clear assay plates that contained 50 µl of test compound at 40 µM (final
concentration of 10 µM, in triplicate) diluted in SD-His-Ura medium (Clontech), 50 µl of
3-amino-1,2,4-triazole (3-AT) at 4× concentration diluted in SD-His-Ura medium
(pH 5.4), and 100 µl of yeast cell suspension diluted in SD-His-Ura medium to a
final A600 nm of 0.02. Growth was at 30°C for 2-5 days. Before measurement of cell 
growth, cells were re-suspended by repeated gentle pipetting to ensure uniform sus- 
pension of cells. Cell growth was measured by absorbance at 600 nm in a microplate 
reader (POLARstar Omega, BMG Biotech). After culling of data from obviously 
contaminated wells, the A600 nm values of each individual well were adjusted as
follows: 100 × (A600 nm of test well − A600 nm of plate median value) to give percentage
growth stimulation (positive values), or percentage growth inhibition (negative
values) in the form of mean ± s.e.m. of three wells.
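
The plate normalization described above can be sketched in a few lines. The example applies the adjustment exactly as printed (100 × the difference between a well and the plate median) and then summarizes triplicate wells as mean and s.e.m. The function names and the example readings are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def growth_change(test_wells: Sequence[float], plate_wells: Sequence[float]) -> List[float]:
    """100 x (A600 of test well - plate median A600), one value per test well."""
    median = statistics.median(plate_wells)
    return [100.0 * (a600 - median) for a600 in test_wells]


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem


if __name__ == "__main__":
    plate = [0.2, 0.3, 0.4, 0.5, 0.6]   # illustrative A600 readings for one plate
    triplicate = [0.5, 0.6, 0.7]        # illustrative readings for one compound
    print(mean_sem(growth_change(triplicate, plate)))
```

Positive values correspond to growth stimulation and negative values to growth inhibition, as in the text.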

To measure and control constitutive activity or leaky HIS expression, each
receptor-yeast combination was plated as above in the absence of ligand over a
range of concentrations of 3-AT. Concentrations of 3-AT that showed moder- 
ate yeast growth (that is, A values of 0.2-0.6) after 2 days at 30°C were used in 
assays for drug screening. To measure concentration-dependent activity, various 
concentrations of cognate ligands diluted in SD-His-Ura medium were incu- 
bated with transformed yeast and appropriate concentrations of 3-AT for 2 days 
at 30°C. 
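
The 3-AT titration rule above (keep concentrations giving moderate growth, that is, A values of 0.2-0.6 after 2 days) amounts to a simple window filter. The sketch below is illustrative: the function name, the concentration units and the example readings are assumptions, and the window bounds are treated as inclusive.

```python
from typing import Dict, List


def usable_3at_concentrations(a600_by_conc: Dict[float, float],
                              low: float = 0.2, high: float = 0.6) -> List[float]:
    """Return the 3-AT concentrations whose ligand-free growth falls in [low, high]."""
    return sorted(conc for conc, a600 in a600_by_conc.items() if low <= a600 <= high)


if __name__ == "__main__":
    # Hypothetical titration: 3-AT concentration (arbitrary units) -> A600 after 2 days
    titration = {0.0: 1.1, 1.0: 0.8, 5.0: 0.55, 10.0: 0.3, 25.0: 0.1}
    print(usable_3at_concentrations(titration))
```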

Site-directed mutagenesis. The GPR68 plasmid was obtained from http://cdna.org.
Mutation of Glu160Ala, Glu160Lys, Glu160Gln, Arg189Leu, Arg189Met and
His269Phe in the GPR68 and mutation of Asp153Ala, Arg187Leu, Phe242Ala 
and Tyr272Ala in the GPR65 were introduced with Agilent’s QuikChange II 
site-directed mutagenesis kit and confirmed by sequencing. To tag the receptors 
for comparing receptor expression levels with immunoblotting, Flag epitope tag 
was inserted at the C terminus of the GPR68 wild-type and mutant receptors, also 
using the QuikChange II site-directed mutagenesis kit. Insertion was confirmed 
by sequencing. 




Split-luciferase based cAMP reporter assays with proton receptors. GPR4, 
GPR65 and GPR68 plasmids were obtained from http://cdna.org. GPR68 mutations
were made and confirmed as above. Receptor-mediated Gs activation was
measured using a split-luciferase reporter assay (GloSensor cAMP assay, Promega). 
In brief, HEK293T cells were transiently co-transfected with receptor DNA and the 
GloSensor cAMP reporter plasmid (GloSensor 7A). Transfected cells were plated 
in poly-L-Lys-coated 384-well white clear bottom cell culture plates in DMEM 
supplemented with 1% dialysed FBS at a density of 15,000 cells per well in a total 
volume of 40 µl for a minimum of 6 h. Before assays, culture medium was removed
and cells were incubated with luciferin (4mM prepared in drug buffer, pH 8.4) 
for 90 min at 37°C. The drug buffer was made with 1 x HBSS supplemented with 
10mM HEPES and 10mM MES modified from!’. TAPS was added to accommo- 
date higher pH values for some assays; no difference was observed between differ- 
ent buffers under the same pH conditions. Cells plated at pH 8.4 for 6h generated 
the same H+ concentration-response curves as those plated at pH 7.4. To make
individual pH solutions, the pH was adjusted with NaOH and measured at room 
temperature with a pH 211 Microprocessor pH meter (Hanna Instruments). To 
measure modulator activity under different pH conditions, modulator was mixed 
with pH solutions before adding to cells. To achieve the goal that drug solutions 
were delivered at the correct pH values, luciferin solution was removed from cell 
plates before addition of drug solutions at predetermined pH values. To improve 
solubility for some hydrophobic compounds, 1 mg ml−1 BSA was added to drug
solutions, and it had no effect on H+ concentration-response curves. For Gs
protein activity (cAMP production), the cell plate was usually incubated at room
temperature for 20 min before being counted in a luminescence counter. Results 
were analysed using GraphPad Prism. 

Allosteric operational model and data analysis. To estimate allosteric parameters, 
results were fitted to the allosteric operational model*”* as shown in the following 
equation: 


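\[
\text{Response} = \text{basal} + (E_{\max} - \text{basal}) \times
\frac{\left(\tau_A [A]\,(K_B + \alpha\beta [B])\right)^n}
{\left([A] K_B + K_A K_B + K_A [B] + \alpha [A][B]\right)^n + \left(\tau_A [A]\,(K_B + \alpha\beta [B])\right)^n}
\]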


In which: 

(1) Response is the measured activity in the form of RLUs for measurement of 
cAMP production. If the results were normalized, the ‘response’ is RLU in fold 
of basal (with buffer control as basal). 

(2) Emax is a system parameter, representing the maximal possible response of the 
system, and this value was normally constrained to the maximal reading of the 
corresponding experiment. 

(3) Basal is the baseline in the absence of test ligand, and is constrained to the 
baseline of the corresponding experiment. If results were normalized to fold 
of basal, the ‘basal’ was usually 1.0. 

(4) [A] and [B] represent concentrations of the orthosteric and allosteric ligands, 
respectively. In the case of GPR68, A is proton. 

(5) K_A and K_B are the equilibrium dissociation constants of the orthosteric
agonist proton (A) and allosteric modulator (B), respectively. To facilitate
curve-fitting with the model, K_A is usually fixed to the binding affinity
determined from traditional radioligand binding assays under the assumption
that the experimentally derived binding affinity is not significantly
different from the functional affinity under the conditions of the corresponding
functional assay. Since proton binding affinity is not a measurable
parameter in this assay system, the proton K_A is therefore constrained to
the corresponding proton potency (EC50, the proton concentration for half-maximal
response) value in the absence of the allosteric ligand, under the
assumption that the proton potency is not significantly different from its
binding affinity when the cAMP production assay is carried out. Since protons
are present at relevant concentrations at physiological pH values, for
a proton receptor K_B is largely a fitting parameter without a clear physical
meaning.

(6) The term τ_A is the orthosteric agonist proton efficacy parameter. Since allosteric
modulators in this study showed no agonist activity, the allosteric modulator
efficacy τ_B is therefore 0 and not included in the function.

(7) The term n is the slope factor linking receptor occupancy to response. Steep 
slopes in this study indicated high cooperativity between proton binding and 
receptor activation, probably reflecting the fact that the proton receptors oper- 
ate within a narrow physiological pH range. 

(8) The allosteric parameter α defines the mutual effect between the orthosteric
agonist A and the allosteric modulator B (α > 1 for increased affinity and α < 1
for reduced affinity), while β defines the allosteric effect on agonist efficacy
(β > 1 for increased efficacy and β < 1 for reduced efficacy).




With K_A, basal and E_max constrained to their corresponding values, the parameters
K_B, τ_A, α, β and n are globally shared fitting parameters for a family of
proton concentration-response curves in the absence and presence of increasing
concentrations of a test allosteric modulator. With the above settings, most curves
could be easily fitted to generate reasonable parameters. If Prism could not fit
the curves, but generated ‘ambiguous fitting’ results, the α value was then manually
constrained to an initial fitting value and systematically changed with small
increments or decrements until the highest stable high affinity value (K_B) was
reached. For GPR65 and GPR68, K_B represents the allosteric binding affinity in the
absence of protons, which is unmeasurable and thus has little physical meaning.
The value K_B/(1+α) represents the binding affinity of an allosteric ligand in the
presence of protons, which could be estimated experimentally. For convenience,
we call K_B/(1+α) the ‘Biochemical binding affinity, Kps’ (Supplementary Table 8)
for an allosteric ligand in the presence of an orthosteric agonist (in this case, H+).

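
For readers who want to evaluate the allosteric operational model numerically, the function below is a minimal sketch of the response equation with the modulator efficacy set to zero, as described above. It is not the fitting procedure used in the study (which was carried out in GraphPad Prism). The function and argument names are hypothetical, concentrations and dissociation constants are assumed to be in molar units, and the proton concentration is taken as 10^(−pH).

```python
def proton_concentration(ph: float) -> float:
    """Convert pH to proton concentration in molar units."""
    return 10.0 ** (-ph)


def allosteric_response(a: float, b: float, *, basal: float, e_max: float,
                        k_a: float, k_b: float, tau_a: float,
                        alpha: float, beta: float, n: float) -> float:
    """Response for orthosteric agonist concentration a and modulator concentration b.

    Implements basal + (E_max - basal) * S / (O + S), where
    S = (tau_A [A] (K_B + alpha*beta [B]))^n and
    O = ([A] K_B + K_A K_B + K_A [B] + alpha [A][B])^n.
    """
    stimulus = (tau_a * a * (k_b + alpha * beta * b)) ** n
    occupancy = (a * k_b + k_a * k_b + k_a * b + alpha * a * b) ** n
    return basal + (e_max - basal) * stimulus / (occupancy + stimulus)


def affinity_with_agonist(k_b: float, alpha: float) -> float:
    """The quantity K_B / (1 + alpha) defined in the text."""
    return k_b / (1.0 + alpha)


if __name__ == "__main__":
    params = dict(basal=1.0, e_max=4.0, k_a=1e-7, k_b=1e-6,
                  tau_a=1.0, alpha=10.0, beta=1.0, n=1.0)  # illustrative values only
    a = proton_concentration(7.0)
    print(allosteric_response(a, 0.0, **params))   # no modulator
    print(allosteric_response(a, 1e-6, **params))  # with a positive modulator
```

With α > 1 and β = 1, adding modulator raises the response at a sub-saturating proton concentration, which is the behaviour expected of a positive allosteric modulator in this model.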
Calcium mobilization assays. HEK293T cells were transfected and plated into 
poly-L-Lys-coated 384-well black clear bottom cell culture plates in DMEM
supplemented with 1% dialysed FBS, at a density of 15,000 cells in 40 µl per well
overnight. Before the assay, medium was removed and cells were loaded with
Fluo-4 Direct calcium dye (Invitrogen) for 60 min at 37°C in a 5% CO2 atmosphere.
The calcium dye was prepared in drug buffer supplemented with 2.5 mM probe- 
necid, pH 8.0. Proton solutions were made with 1x HBSS, 7mM HEPES, 7mM 
HEPPS and 7 mM MES, and pH was adjusted with NaOH. Drug additions and 
fluorescence intensity measurements were carried out in a FLIPR Tetra, which was
programmed to add drug solutions to cells while recording fluorescence intensity.
To measure proton concentration-responses, 10 µl of solutions of pre-determined pH
were added to each well (with 20 µl calcium dye) while fluorescence intensity was
recorded during and after addition for 4 min (one reading per second). The addition
procedure was configured in such a way (30 µl per second at a height of 10 µl
above cells) that local proton concentrations for cells were essentially the same as 
in the pH working solutions at the moment of addition. Fluorescence intensities 
reached peak values within 30s after drug addition. To determine the effects of 
modulators on proton responses, the protocol was modified slightly. In brief, cells 
were loaded with calcium dye as above, but only at 15 µl per well. The FLIPR Tetra
was programmed to first add 5 µl of 4× test compound (final concentration of
10 µM before addition of 10 µl of pH solutions) prepared with the same drug buffer
at pH 8.0 (buffer alone served as a control). After a total of 10 min of reading
and incubation, 10 µl of the pH solutions were added and the fluorescence intensity
was recorded exactly the same way as above. Results (fluorescence intensity
in fold of basal) were exported and analysed in GraphPad Prism. For calcium 
mobilization assays with 5-HT , receptors, HEK293 cells stably expressing human 
5-HT», receptors were used instead of transiently-transfected cells. Cells were 
set up and tested in the same way as above, with 5-HT serving as an agonist control 
(3 pM-30 µM), and with 1 nM 5-HT being used in the second addition to determine
the antagonist activity of ogerin.
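
The dilution steps in the modulator protocol can be verified with simple volume arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the 40 µM stock is inferred from '4× test compound' and the stated 10 µM final concentration.

```python
def final_concentration(stock_conc: float, added_volume: float, existing_volume: float) -> float:
    """Concentration after adding `added_volume` of stock to `existing_volume` of buffer.

    Any consistent units may be used, for example µM for concentration and µl for volume.
    """
    return stock_conc * added_volume / (added_volume + existing_volume)


if __name__ == "__main__":
    # 5 µl of 4x compound (assumed 40 µM) added to 15 µl of dye-loaded well
    after_compound = final_concentration(40.0, 5.0, 15.0)
    print(after_compound)
    # The later 10 µl pH addition dilutes the 20 µl well further
    print(final_concentration(after_compound, 20.0, 10.0))
```

The first value reproduces the 10 µM stated in the text for the period before the pH solution is added. The same function applies to the yeast assay plates, where 50 µl of compound at 40 µM in a 200 µl well also gives 10 µM.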

Phosphatidylinositol hydrolysis assay. HEK293T cells were transfected for 24 h
and plated in poly-L-Lys-coated 96-well black clear bottom cell culture plates in
DMEM supplemented with 10% FBS, at a density of 60,000 cells in 100 µl per
well. After 5 h, cells were washed with inositol-free DMEM once and labelled with
3H-inositol (1 µCi per well, PerkinElmer) in inositol-free DMEM supplemented
with 5% dialysed FBS overnight. On the assay day, labelling medium was removed
and cells were washed once with assay buffer (1× HBSS, 10 mM HEPES, 10 mM
MES and 20 mM LiCl, pH 8.4). To measure drug concentration responses,
cells were then incubated with drug solutions at pH 8.4 for 20 min. To measure
proton concentration responses, the assay buffer was pre-adjusted to desired pH
values and supplemented with 20 mM LiCl. To measure the effect of ogerin or its
isomer ZINC32547799 on proton concentration-response curves, pH solutions
were supplemented with 20 mM LiCl and 10 µM ogerin or ZINC32547799. The
premixed drug solutions were added to cells for 20 min. At the end of incubation,
drug solutions were removed and 40 µl per well of 50 mM ice-cold formic acid
was added. After incubation at 4°C for 30 min, the acid extracts were transferred
to polyethylene terephthalate 96-well sample plates (1450-401, PerkinElmer) and
mixed with 75 µl (200 µg) YSi RNA binding beads (RPNQ0013, PerkinElmer).
The plate was sealed and further incubated at 4°C for 30 min before being counted
on a TriLux MicroBeta counter. Results (c.p.m. per well) were analysed using
GraphPad Prism.

Functional assays with A2A and CXCR4 receptors. Functional assays with A2A
adenosine and CXCR4 chemokine receptors were carried out using a slightly dif- 
ferent protocol from that previously described for G, (above) and G; receptors™. 
Specifically, HEK293T cells were transfected and plated using regular DMEM sup- 
plemented with 1% dialysed FBS. Before assays, culture medium was removed, 
and cells were incubated with 20 µl drug solution (prepared in drug buffer:
20 mM HEPES, 1× HBSS, pH 7.4) for 15 min at room temperature. To measure
agonist activity, 5 µl of 5× luciferin solution (4 mM final concentration) for A2A
(Gs-coupled GPCRs) or a mixture of luciferin and isoproterenol at a final
concentration of 200 nM for CXCR4 (Gi-coupled GPCRs) was added and cells were
incubated for another 20 min. To measure antagonist activity, test compound was 
added first for 10 min before a reference agonist at a final of ECgp concentration for 
another 10 min, followed by addition of luciferin for A2A or a mixture of
luciferin and isoproterenol for CXCR4 as above. Luminescence was measured in 
a luminescence counter. Results were analysed in GraphPad Prism. 

Radioligand binding assays. Radioligand binding assays with selected CNS tar- 
gets were carried out as described***” and as detailed in the PDSP protocol book 
available online (http://pdsp.med.unc.edu/pdspw/binding.php). In brief, receptor 
membrane preparations were made from either animal brain tissues, or stable cell 
lines, or transiently transfected HEK293T cells. Receptor expression levels and 
radioligand binding affinities were determined with saturation binding assays. 
Competition binding assays were performed with membrane aliquots and a fixed 
concentration of radioligand in 96-well plates in a final volume of 125 µl. Reactions
were incubated in the dark and at room temperature (22°C), and terminated by 
vacuum filtration onto 96-well formatted GF/B filters. Radioactivity on the filters 
was counted in a beta counter. Results were analysed in GraphPad Prism. 

Anti-HA immunoblots. HEK293 cells were transfected with either pcDNA3
vector containing a haemagglutinin (HA) cassette within the multiple cloning 
site, or pCDNA3HA-GPR68 encoding human GPR68 with an N-terminal HA tag. 
Stable lines were generated by selection with 250 µg ml−1 G418, with >90% of cells
expressing HA after 2 weeks as assessed by immunocytochemistry (not shown). 
Cells were plated into 12-well plates, grown to confluence, and media switched 
to Hams-F12 media, with pH adjusted to pH 8.0 or 7.4, for 1h. Cells were then 
stimulated with vehicle, 50 µM ogerin, or 50 µM lorazepam for 10 min. Lysates
were collected and subjected to immunoblotting, with blots probed using primary 
antibodies against HA (Sigma cat H3663), total vasodilator-stimulated phospho- 
protein (VASP, BD Biosciences, cat 610448), p-p42/p44 (Cell Signaling, cat 5726S), 
and β-actin (Sigma, cat A1978), and secondary antibodies (Licor, cat 926-32213
and 926-32210) conjugated with infrared fluorophores as described previously™. 

Anti-Flag immunoblots. HEK293T cells were transiently transfected in 10-cm
dishes with Flag-tagged GPR68 wild-type and mutant receptors. Untransfected 
HEK293T cells served as a negative control. After 48 h, cells were collected, lysed 
and sonicated to shear chromatin before being subjected to immunoblotting. Blots 
were probed with monoclonal anti-Flag M2-peroxidase antibody (Sigma, A8952). 
Bands were quantified and normalized to GPR68 wild-type receptor (fold) for 
graphing. 

Data analysis and reporting. Other than in vivo studies (below), no statistical 
analysis was applied to yeast- or cell-based screening assays. Sample size (number 
of assays for each compound or receptor) was predetermined to be in triplicate 
or quadruplicate for primary screening assays at a single concentration. Some 
samples were repeated more than the others in the primary screening assays and 
the number of measurements was specified as a range in the corresponding figure
legends. For concentration-response assays, the sample size (number of assays for 
each compound at selected receptors) was also predetermined to be tested for a 
minimum of three assays, each in triplicate or quadruplicate. Samples or receptors 
were tested not randomly but in an alphabetic order or numeric order according 
to their coded names for easy organization and were thus blinded. For each batch 
of assays, control assays with isoproterenol and proton concentration-responses
were included. If the potency value for either isoproterenol or proton was >0.5 log
unit away from established averages, assays with the batch of transfected cells 
were excluded. For structure-activity relationship (SAR) studies, only the assays 
in which all related compounds were tested side by side were included. None of 
the functional assays were blinded to investigators. 
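
The batch exclusion rule above (reject a batch of transfected cells if the isoproterenol or proton control potency is more than 0.5 log unit from its established average) can be written as a simple check. The sketch below is illustrative: the function name, the dictionary keys and the potency values are hypothetical, and potencies are assumed to be expressed on a log scale such as pEC50.

```python
from typing import Dict


def batch_passes_qc(control_potency: Dict[str, float],
                    established_potency: Dict[str, float],
                    tolerance_log_units: float = 0.5) -> bool:
    """True if every control potency lies within the tolerance of its established average."""
    return all(
        abs(control_potency[name] - established_potency[name]) <= tolerance_log_units
        for name in established_potency
    )


if __name__ == "__main__":
    established = {"isoproterenol": 7.5, "proton": 7.3}  # hypothetical averages
    print(batch_passes_qc({"isoproterenol": 7.9, "proton": 7.2}, established))
    print(batch_passes_qc({"isoproterenol": 7.6, "proton": 6.6}, established))
```

A batch fails as soon as either control drifts beyond the tolerance, matching the 'either isoproterenol or proton' wording of the criterion.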

Generation of GPR68-knockout mice. To generate GPR68-knockout mice, 
a probe specific for the human GPR68 transcript was generated by PCR amplifi- 
cation of a 450-base-pair (bp) segment of the coding sequence of the final exon 
of GPR68 using total placental RNA. The probe was used to identify a clone from 
a 129 mouse genomic lambda library. The genomic insert was subcloned and a 
restriction map generated using a panel of enzymes. The targeting construct for 
the GPR68 locus consists of a PGK-1 promoter driven neomycin resistance cassette 
flanked by two arms of homology with the mouse GPR68 locus. The longer arm 
of homology was generated using a 7,266-bp PstI fragment extending from the 
last intron to the beginning of the last exon. This exon contains the entire coding 
sequence of the GPR68 gene. The 1,335-bp shorter arm was generated by PCR 
amplification and extends from the downstream end of the long arm into the 
3’ untranslated region of the gene. Homologous recombination of the targeting con- 
struct with the GPR68 locus inserts the neomycin resistance cassette into codon 78 
of the gene, thereby disrupting expression. Correctly targeted cell lines were
identified by Southern blot analysis using a probe consisting of a 1,496-bp PstI fragment
immediately upstream of the long arm. This probe recognizes a 14,290-bp EcoRV
fragment in the endogenous locus and a 7,855-bp fragment in the targeted locus. 
Genotyping was carried out by PCR with three primers. The common
(5′-GCAGAGGAAGCCCACGCTGATGTA-3′) and endogenous
(5′-TAAACGGTAGCTGTGATTATTCAA-3′) primers generate a 516-bp PCR product from the
endogenous locus, while the common and targeted (5′-AAATGCCTGCTCTTTACTGAAGG-3′)
primers generate a 465-bp product from the targeted locus. The chimaeras were
bred to C57BL/6J mice and pups carrying the mutant allele identified. After ten 
successive crosses of heterozygous animals to C57BL/6J mice, heterozygous mice 
were intercrossed and a congenic Gpr68−/− and C57BL/6J breeding colony established.
The GPR68-knockout mice were profiled in several behavioural tests as
described below in detail and results are summarized in Extended Data Fig. 11 
and Supplementary Tables 11 and 12. 
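
The three-primer genotyping scheme described above distinguishes the alleles by product size: 516 bp from the endogenous locus and 465 bp from the targeted locus. The sketch below shows how band sizes read from a gel could be turned into a genotype call. The function name, the genotype labels and the 10-bp matching tolerance are assumptions made for the example and are not part of the published protocol.

```python
from typing import Iterable

ENDOGENOUS_BP = 516  # product of the common + endogenous primers
TARGETED_BP = 465    # product of the common + targeted primers


def call_genotype(band_sizes_bp: Iterable[float], tolerance_bp: float = 10.0) -> str:
    """Call a genotype from the estimated sizes of the PCR bands in one lane."""
    bands = list(band_sizes_bp)
    has_endogenous = any(abs(size - ENDOGENOUS_BP) <= tolerance_bp for size in bands)
    has_targeted = any(abs(size - TARGETED_BP) <= tolerance_bp for size in bands)
    if has_endogenous and has_targeted:
        return "heterozygous"
    if has_endogenous:
        return "wild-type"
    if has_targeted:
        return "knockout"
    return "no call"


if __name__ == "__main__":
    print(call_genotype([515, 467]))
    print(call_genotype([463]))
```

Because the two products differ by 51 bp, a 10-bp tolerance keeps the two size windows from overlapping.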

In vivo behavioural profiles of GPR68-knockout mice. Mice were maintained 
and handled according to the Guide for the Care and Use of Laboratory Animals 
approved by the Institutional Animal Care and Use Committee of the University 
of North Carolina at Chapel Hill. The goal of this study was to determine whether 
targeted deletion of GPR68 alters behavioural function in mice. 

Timeline for behavioural tests. The following tests were performed with mice at 
the ages shown in parentheses. Elevated plus maze test for anxiety-like behaviour 
(6-7 weeks); activity in an open field, accelerating rotarod (2 tests, 48h apart) 
(7-8 weeks); three-chamber social approach test, activity in an open field (re-test) 
(8-9 weeks); marble-burying assay (9-10 weeks); acoustic startle test, buried food 
test for olfactory ability (10-11 weeks); visual cue test in the Morris water maze 
(11-12 weeks); hidden platform test for spatial learning (12-14 weeks); reversal 
learning in the Morris water maze (14-16 weeks); second acoustic startle test, 
hotplate test for thermal sensitivity (16-17 weeks). 

Summary of results. Mice with deletion of GPR68 had normal performance 
in most of the behavioural tests. No effects of genotype were observed for body 
weights, activity and anxiety-like behaviour in an elevated plus maze or an open 
field, motor coordination, sociability, prepulse inhibition of acoustic startle 
responses or acquisition in the water maze. However, both male and female GPR68- 
knockout mice had small, significant decreases in acoustic startle responses, sug- 
gesting a reduced responsivity to environmental stimuli. Male GPR68-knockout 
mice also showed significant decreases in marble burying, a test for anxiety-like 
phenotypes. Overall, the findings indicate that GPR68 might have a role in specific 
domains of behaviour. 

Elevated plus maze. This test is used to assess anxiety-like behaviour in rodents. 
The procedure is based on a natural tendency of mice to actively explore a new 
environment, versus a fear of being in an open area. In the present study, mice were 
given one 5-min trial on the plus maze, which had two walled arms (the closed 
arms, 20 cm in height) and two open arms. The maze was elevated 50 cm from the
floor, and the arms were 30 cm long. Animals were placed on the centre section
(8 × 8 cm), and allowed to freely explore the maze. Measures were taken of time
on, and number of entries into, the open and closed arms. All of the experimental 
groups showed a strong preference for the closed arms, in comparison to the open 
arms, of the elevated plus maze. As shown in Supplementary Table 11, there were 
no significant differences between the wild-type and GPR68-knockout mice for 
percentage time or percentage entries on the open arms, or for total entries during 
the task. 
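
The open-arm measures reported above can be derived from raw arm times and entry counts. The following is a minimal sketch, assuming that percentages are taken relative to the combined open-arm and closed-arm totals (the text does not state the denominator); the function and argument names are illustrative only.

```python
def open_arm_measures(open_time_s, closed_time_s, open_entries, closed_entries):
    """Return (% time on open arms, % entries into open arms, total entries).

    Assumption: percentages use open + closed arm totals as the denominator.
    """
    total_time = open_time_s + closed_time_s
    total_entries = open_entries + closed_entries
    # Guard against division by zero for an animal that never left the centre.
    pct_time = 100.0 * open_time_s / total_time if total_time else 0.0
    pct_entries = 100.0 * open_entries / total_entries if total_entries else 0.0
    return pct_time, pct_entries, total_entries
```

A mouse spending 60 s on the open arms and 180 s on the closed arms, with 5 open and 15 closed entries, would score 25% open-arm time and 25% open-arm entries under this convention.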

Activity in an open field. Exploratory activity in a novel environment was assessed 
in an open field chamber (41 x 41 x 30cm) crossed by a grid of photobeams 
(VersaMax system, AccuScan Instruments). Counts were taken of the number of 
photobeams broken during the trial in 5-min intervals, with separate measures for 
ambulation (total distance travelled) and rearing movements. Time spent in the 
centre region of the open field was measured as an index of anxiety-like behaviour. 
Unfortunately, an equipment malfunction led to the loss of data for 8 mice during 
the first activity test, conducted when mice were 7-8 weeks in age. Therefore, a 
second activity test was given, when mice were 8-9 weeks in age. As depicted in 
Extended Data Fig. 11a, b, there were no significant differences between the wild- 
type and GPR68-knockout mice for distance travelled, or for rearing or centre 
time (data not shown), during the second activity test. A significant sex x time 
interaction was found for the distance measure (F(11,335) = 2.68, P = 0.0025), reflecting
higher levels of activity in the female groups at the beginning of the session.

Accelerating rotarod test. Subjects were tested for motor coordination and learning
on an accelerating rotarod (Ugo Basile). For the first test session, animals were
given three trials, with 45 s between each trial. Two additional trials were given 
48h later. Revolutions per minute (rpm) was set at an initial value of 3, with a 
progressive increase to a maximum of 30 rpm across five minutes (the maximum 
trial length). Measures were taken for latency to fall from the top of the rotating 
barrel. As shown in Extended Data Fig. 11c, d, deletion of GPR68 did not lead 
to deficits in motor coordination on the rotarod. In fact, during the first three
acquisition trials, there was a non-significant trend for enhanced performance in
the male knockout group (repeated-measures ANOVA, genotype × sex interaction,
F(1,35) = 3.58, P = 0.0668).

© 2015 Macmillan Publishers Limited. All rights reserved

Marble-burying assay. This procedure is used to evaluate anxiety-like behav- 
iour and repetitive responses. Mice were tested in a Plexiglas cage located in a 
sound-attenuating chamber with ceiling light and fan. The cage contained 5 cm 
of corncob bedding, with 20 black glass marbles (14 mm diameter) arranged in 
an equidistant 5 x 4 grid on top of the bedding. Animals were given access to the 
marbles for 30 min. Measures were taken of the number of buried marbles (two- 
thirds of the marble covered by the bedding). A two-way ANOVA indicated a 
significant genotype x sex interaction (F(1,35) = 7.37, P= 0.0102) (Supplementary 
Table 11). Post-hoc comparisons revealed that the male GPR68-knockout mice 
buried significantly fewer marbles than both male wild-type mice and female 
knockout mice in this task. 

Buried food test for olfactory function. Several days before the olfactory test, an 
unfamiliar food (Froot Loops, Kellogg Co.) was placed overnight in the home cages 
of the mice. Observations of consumption were taken to ensure that the novel food 
was palatable. Sixteen to twenty hours before the test, all food was removed from 
the home cage. On the day of the test, each mouse was placed in a large, clean tub 
cage (46 x 23.5 x 20cm (width, length, height)), containing paper chip bedding 
(3-cm deep), and allowed to explore for 5 min. The animal was removed from the 
cage, and one Froot Loop was buried in the cage bedding. The animal was then 
returned to the cage and given fifteen minutes to locate the buried food. Measures 
were taken of latency to find the food reward. As shown in Supplementary 
Table 11, there were no significant differences between the groups in latency to 
find the buried food. 

Hotplate test for thermal sensitivity. Individual mice were placed in a tall plastic 
cylinder located on a hotplate, with a surface heated to 55°C (IITC Life Science). 
Reactions to the heated surface, including hindpaw lick, vocalization or jumping, 
led to immediate removal from the hotplate. Measures were taken of latency to 
respond. The maximum test length was 30 s, to avoid paw damage. A two-way 
ANOVA indicated a significant main effect of sex (F(1,35) = 8.83, P = 0.0053), and a
genotype × sex interaction (F(1,35) = 4.3, P = 0.0455) (Supplementary Table 11).
Post-hoc comparisons revealed that the male GPR68-knockout mice had signifi- 
cantly lower latencies to respond than female knockout mice. 

Acoustic startle method. The acoustic startle test can be used to assess auditory 
function and sensorimotor gating. The test is based on the measurement of the 
reflexive whole-body flinch, or startle response, that follows exposure to a sudden 
noise. Mice can be evaluated for levels of startle magnitude and prepulse inhibition, 
which occurs when a weak prestimulus leads to a reduced startle in response to 
a subsequent louder noise. For this study, animals were tested with a San Diego 
Instruments SR-Lab system. In brief, mice were placed in a small Plexiglas cylin- 
der within a larger, sound-attenuating chamber. The cylinder was seated upon a 
piezoelectric transducer, which allowed vibrations to be quantified and displayed 
on a computer. The chamber included a house light, fan, and a loudspeaker for the 
acoustic stimuli. Background sound levels (70 dB) and calibration of the acoustic 
stimuli were confirmed with a digital sound level meter (San Diego Instruments). 
Each session began with a 5-min habituation period, followed by 42 trials.
There were seven different types of trials: the no-stimulus trials, trials with the 
acoustic startle stimulus (40 ms; 120 dB) alone, and trials in which a prepulse 
stimulus (20 ms; 74, 78, 82, 86 or 90 dB) occurred 100 ms before the onset of the 
startle stimulus. Measures were taken of the startle amplitude for each trial across 
a 65-ms sampling window, and an overall analysis was performed for each subject’s
data for levels of prepulse inhibition at each prepulse sound level (calculated
as 100 − (response amplitude for prepulse stimulus and startle stimulus together /
response amplitude for startle stimulus alone) × 100).
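
The prepulse inhibition formula above translates directly into code. This is a minimal sketch; the function and argument names are illustrative, and the amplitudes are assumed to be in whatever units the startle system reports.

```python
def prepulse_inhibition(prepulse_plus_startle, startle_alone):
    """Percent prepulse inhibition: 100 - (prepulse+startle / startle alone) * 100."""
    if startle_alone == 0:
        raise ValueError("startle-alone amplitude must be non-zero")
    return 100.0 - (prepulse_plus_startle / startle_alone) * 100.0
```

For example, a response of 250 on prepulse trials against 1,000 on startle-alone trials corresponds to 75% inhibition; a value of 0% means the prepulse had no effect.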

Results from acoustic startle test. The GPR68-knockout mice had decreased 
startle responses after presentation of acoustic stimuli, in comparison to the 
wild-type mice (Extended Data Fig. 11e, f). A repeated-measures ANOVA, 
conducted on startle response amplitudes, indicated significant main effects of 
genotype (F(1,35) =7.22, P=0.011) and sex (F(1,35) = 16.61, P=0.0003), and a 
genotype x decibel level interaction (F(6,210) =5.77, P< 0.0001). Separate com- 
parisons confirmed that both male and female knockout mice showed signifi- 
cant reductions in startle responses (genotype x decibel level interaction, males, 
F(6,84) = 2.57, P=0.0245; and females, F(6,126) = 3.48, P= 0.0032). The decreased 
startle responses and overt sex differences were not associated with changes in 
prepulse inhibition (Extended Data Fig. 11g, h). The significant main effects of 
genotype on startle were no longer evident during a second acoustic startle test, 
conducted when mice were 16-17 weeks in age. 

Morris water maze, visible platform test. The Morris water maze task was used to 
assess spatial learning and visual function in the mice. The water maze consisted 
of a large circular pool (diameter = 122 cm) partially filled with water (45 cm deep,
24-26 °C), located in a room with numerous visual cues. Mice were first tested
using a visible platform. In this case, each animal was given four trials per day, 
across 2 days, to swim to an escape platform cued by a patterned cylinder extending 
above the surface of the water. For each trial, the mouse was placed in the pool at 
one of four possible locations (randomly ordered), and then given 60s to find the 
visible platform. If the mouse found the platform, the trial ended, and the animal 
was allowed to remain 10s on the platform before the next trial began. If the plat- 
form was not found, the mouse was placed on the platform for 10s, and then given 
the next trial. Measures were taken of latency to find the platform via an automated 
tracking system (Noldus Ethovision). As shown in Supplementary Table 12, all 
groups of mice demonstrated a high degree of proficiency in the visual cue task.

Acquisition and reversal learning in the hidden platform test (Extended
Data Fig. 11i–l). Three days after the visual cue task, mice were tested for their
ability to find a submerged, hidden escape platform (diameter = 12 cm). As in the 
procedure for visual cue learning, each animal was given four trials per day, with 
1 min per trial, to swim to the hidden platform. The criterion for learning was an
average latency of 15s or less to locate the platform on 1 day. Mice were tested until 
the criterion was reached, with a maximum of 9 days of testing. When criterion 
was reached, mice were given a 1-min probe trial in the pool with the platform 
removed. In this case, selective quadrant search was evaluated by measuring num- 
ber of crosses over the location where the platform (the target) had been placed 
during training, and the corresponding areas in the other three quadrants. After the 
acquisition phase, mice were tested for reversal learning, using the same procedure 
as described above. In this phase, the hidden platform was located in a different 
quadrant in the pool, diagonal to its previous location. As before, measures were 
taken of latency to find the platform. On the day that the criterion for learning was 
met, the platform was removed from the pool, and the group was given a probe 
trial to evaluate reversal learning. 
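
The learning criterion described above (a daily average latency of 15 s or less, with at most 9 days of testing) can be sketched as follows. The data layout, one list of trial latencies per day, and the function name are assumptions made for illustration.

```python
def day_criterion_reached(latencies_by_day, criterion_s=15.0, max_days=9):
    """Return the first day (1-based) whose mean latency meets the criterion.

    latencies_by_day: list of per-day lists of trial latencies in seconds.
    Returns None if the criterion is not met within max_days.
    """
    for day, trials in enumerate(latencies_by_day[:max_days], start=1):
        if trials and sum(trials) / len(trials) <= criterion_s:
            return day
    return None
```

With four trials per day, a mouse averaging 14.75 s on day 3 would reach criterion on that day and proceed to the probe trial.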

For the above behavioural profiling studies, subjects were 21 wild-type mice 
(9 males and 12 females) and 18 GPR68-knockout mice (7 males and 11 females), 
on a C57BL/6 background. Sample sizes were not statistically predetermined. 
Testing began when animals were 6-7 weeks of age. For each procedure, measures 
were taken by an observer blinded to mouse genotype (wild type or knockout) 
and no animals were excluded from analysis. Data were analysed using one-way 
or repeated-measures ANOVA. Fisher’s protected least-significant difference tests 
were used for comparing group means only when a significant F value was deter- 
mined. Within-group comparisons were conducted to determine side preference in 
the social behaviour tests. For all comparisons, significance was pre-set at P < 0.05.

Effect of ogerin and its analogue ZINC32547799 on learning and memory.
Contextual and cue-dependent learning and memory were evaluated using a 
Near-Infrared Video Fear Conditioning system (MED Associates). Test cham- 
bers (29 x 25 x 25cm) had transparent walls and metal rod floors, and were 
enclosed in sound-attenuating boxes. The conditioned fear procedure had three 
phases: training, a test for contextual learning, and a test for cue-dependent learn- 
ing. Before each phase, mice were moved to a holding room adjacent to the test 
room and acclimated for at least 30 min. In the 8-min training phase, mice received
three pairings of a 30-s, 90-dB, 5-kHz tone (the conditioned stimulus) and a 2-s, 
0.6-mA foot shock (the unconditioned stimulus), in which the shock was presented 
during the last 2s of the tone. Context-dependent learning was evaluated 24h 
after the training phase. Mice were placed back into the original test chamber, and 
levels of freezing (immobility) were determined across a 5-min session, without 
the presence of the conditioned or unconditioned stimulus. Forty-eight hours after 
the training phase, mice were evaluated for associative learning to the auditory 
cue (the conditioned stimulus) in a final 6-min session. The conditioning cham- 
bers were modified using a Plexiglas insert to change the wall and floor surface, 
and a novel odour (vanilla flavouring) was added to the sound-attenuating box. 
Baseline behaviour was scored for 2 min, and then three 30-s conditioned stimulus 
tones were presented across a 4-min period. Levels of freezing were automatically 
measured by the image tracking software (Med Associates). Freezing was defined 
as no movement (below the movement threshold) for 0.5 s. To evaluate the effect
of the drug, strain-matched groups of animals were given ogerin (10 mg kg⁻¹ in 10%
Tween 80 or saline) 30 min before the training.
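
The freezing definition above (movement below threshold for at least 0.5 s) can be illustrated with a short scoring routine. This is a sketch only, not the vendor software's algorithm: the per-frame motion values, the threshold and the frame rate are all hypothetical inputs.

```python
def percent_freezing(motion, threshold, fps=30.0, min_bout_s=0.5):
    """Percentage of frames that fall inside freezing bouts.

    A bout is a run of consecutive frames with motion below `threshold`
    lasting at least `min_bout_s` seconds. `fps` is an assumed frame rate.
    """
    min_frames = int(round(min_bout_s * fps))
    frozen = 0
    run = 0
    for value in motion:
        if value < threshold:
            run += 1
        else:
            if run >= min_frames:
                frozen += run  # count the whole bout once it qualifies
            run = 0
    if run >= min_frames:  # close a bout that lasts to the end of the session
        frozen += run
    return 100.0 * frozen / len(motion) if motion else 0.0
```

Runs of immobility shorter than the minimum bout length are ignored, so brief pauses do not count as freezing.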

For the learning and memory studies, sample sizes (number of animals) were 
not predetermined by a statistical method, and a minimum of six male animals (aged
6-8 weeks) was used in each group (exact numbers of animals are specified in
not blinded to investigators. No animals were excluded from analysis. Statistical 
analyses were performed after first assessing the normality of distributions of data 
sets. Comparisons between groups were made using unpaired t-tests. Welch’s 
corrections were used when variances between groups were unequal. Comparisons 
between groups during conditioning, contextual and cued memory tests were 
assessed using two-way ANOVA with P < 0.05 being considered significant. 
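
For the unequal-variance case mentioned above, Welch's correction changes both the standard error and the degrees of freedom of the t-test. The sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library; it illustrates the textbook formulas and is not the analysis software used in the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The resulting degrees of freedom are generally non-integer and smaller than the pooled value of n1 + n2 − 2, which makes the test more conservative when variances differ.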




Eswar, N. et al. Comparative protein structure modeling using MODELLER. Curr. Protoc. Protein Sci. Chapter 2, Unit 2.9 (2001).
Yang, Q. & Sharp, K. A. Building alternate protein structures using the elastic network model. Proteins 74, 682-700 (2009).
Jacobson, M. P., Friesner, R. A., Xiang, Z. & Honig, B. On the role of the crystal environment in determining protein side-chain conformations. J. Mol. Biol. 320, 597-608 (2002).
Li, J., Zhu, T., Cramer, C. J. & Truhlar, D. G. New class IV charge model for extracting accurate partial charges from wave functions. J. Phys. Chem. A 102, 1820-1831 (1998).
Chambers, C. C., Hawkins, G. D., Cramer, C. J. & Truhlar, D. G. Model for aqueous solvation based on class IV atomic charges and first solvation shell effects. J. Phys. Chem. 100, 16385-16398 (1996).
Hert, J., Keiser, M. J., Irwin, J. J., Oprea, T. I. & Shoichet, B. K. Quantifying the relationships among drug classes. J. Chem. Inf. Model. 48, 755-765 (2008).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100-D1107 (2012).
Mumberg, D., Muller, R. & Funk, M. Yeast vectors for the controlled expression of heterologous proteins in different genetic backgrounds. Gene 156, 119-122 (1995).
53. Erlenbach, I. et al. Functional expression of M1, M3 and M5 muscarinic acetylcholine receptors in yeast. J. Neurochem. 77, 1327-1337 (2001).
54. Armbruster, B. N., Li, X., Pausch, M. H., Herlitze, S. & Roth, B. L. Evolving the lock to fit the key to create a family of G protein-coupled receptors potently activated by an inert ligand. Proc. Natl Acad. Sci. USA 104, 5163-5168 (2007).
55. Christopoulos, A. & Kenakin, T. G protein-coupled receptor allosterism and complexing. Pharmacol. Rev. 54, 323-374 (2002).
56. Besnard, J. et al. Automated design of ligands to polypharmacological profiles. Nature 492, 215-220 (2012).
57. Keiser, M. J. et al. Predicting new molecular targets for known drugs. Nature 462, 175-181 (2009).
58. Horvat, S. J. et al. A-kinase anchoring proteins regulate compartmentalized cAMP signaling in airway smooth muscle. FASEB J. 26, 3670-3679 (2012).
59. Huang, X.-P., Mangano, T., Hufeisen, S., Setola, V. & Roth, B. L. Identification of human ether-a-go-go related gene modulators by three screening platforms in an academic drug-discovery setting. Assay Drug Dev. Technol. 8, 727-742 (2010).




Extended Data Figure 1 | Validation and confirmation of GPCR
activation assays. a–o, Yeast (a–k) and HEK293T cell (l–o) GPCR
activation assays. a–d, Concentration-dependent growth of GPR43-expressing
yeast (a, b) and GPR41-expressing yeast (c, d), each receptor tested in two
G-protein strains, in response to various short-chain fatty acids (SCFAs;
formate, acetate, propionate and butyrate). e–h, Concentration-dependent
growth of GPR39-expressing Gs yeast (GPR39s) in response to zinc
ions (e), chromium ions (f), cadmium ions (g) and iron ions (h).
i–k, Concentration-dependent cAMP responses of GPR39-expressing
HEK293T cells to ZnCl2 (i), ZnSO4 (j) or CdSO4 (k) as measured by
luciferase cAMP reporter assay. l, N-unsubstituted benzodiazepines
(lorazepam, clonazepam, desmethyldiazepam and norfludiazepam; 10 µM)
stimulated cAMP production in a GPR68- and pH-dependent manner.
Data are mean ± s.e.m. (n = 3–66 measurements). m–p, Concentration–
response curves of N-unsubstituted benzodiazepines lorazepam (m),
desmethyldiazepam (n), clonazepam (o) and norfludiazepam (p) at
pH 6.50 or 7.40 in GPR68-transfected HEK293T cells (structures in
Supplementary Table 1). Normalized results represent mean ± s.e.m.
(n = 3) and curves were analysed in GraphPad Prism using the built-in
four-parameter logistic function.
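
For readers reproducing the concentration–response fits outside GraphPad Prism, the four-parameter logistic curve can be written as a plain function. The sketch below uses the common variable-slope form on log concentration; the parameter names are illustrative, and fitting (parameter estimation) is not shown.

```python
def four_param_logistic(log_conc, bottom, top, log_ec50, hill=1.0):
    """Response at a given log10 concentration for a four-parameter logistic curve.

    bottom/top: lower and upper plateaus; log_ec50: log10 of the half-maximal
    concentration; hill: slope factor (1.0 gives a standard sigmoid).
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - log_conc) * hill))
```

At a concentration equal to the EC50 the function returns the midpoint between the two plateaus, which is a quick sanity check on any fitted parameter set.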




Extended Data Figure 2 | Lorazepam and ogerin have minimal GPR4
or GPR65 activity. a–d, Effect of lorazepam (a, b) or ogerin (c, d) on
GPR4 (a, c) or GPR65 (b, d); data represent normalized mean ± s.e.m.
(n = 3). Sequence alignment of proton-sensing receptors and docking poses
for ogerin and its analogues. e, GPR68 snake plot showing extracellular
loops and transmembrane domains (upper portion); important residues
are highlighted. Glu160, Arg189 and His269 were mutated in this study.
f, Sequence alignment of GPR4, GPR65 and GPR68 to CXCR4 (PDB code
3ODU) (PROMALS-3D) was manually refined to reduce gaps and to
position conserved residues. TM, transmembrane regions; IL, intracellular
loop; EL, extracellular loop. Conserved residues are highlighted in blue by
degree of conservation, whereas red boxes indicate residues important for
receptor function. Red stars indicate residues mutated in this study.
g, Sampling different regions for lorazepam binding modes in GPR68.
Yellow and grey surfaces contour the binding site of IT1t and CVX15 in
CXCR4 crystal structures (PDB codes 3ODU and 3OE0, respectively),
while green and red surfaces sample the entire binding pocket. The
magenta surface represents the canonical orthosteric biogenic amine
site. h, ZINC32547799 in its predicted orientation and interactions with
GPR68. i, Optimization of ogerin (magenta, thin lines) to C2 (brown,
structure in Fig. 3a) by insertion of a single methylene is predicted to
improve packing in the aryl pocket of the ogerin site. Adding a second
methylene, thus creating a propyl linker in C3 (yellow, structure in Fig. 3a),
is predicted to disrupt the packing and thus to reduce the allosteric effect.




Extended Data Figure 3 | Heat map of off-target activities of lead
compounds at potential CNS drug targets. Radioligand binding assays
were carried out by the National Institute of Mental Health Psychoactive
Drug Screening Program (NIMH PDSP) as described previously (refs 56, 57)
(online protocols available at http://pdsp.med.unc.edu/pdspw/binding.php).
Values represent mean binding affinities (pKi, n = 2–4). Affinities
lower than a pKi of 5, or less than 50% inhibition at 10 µM, are shown
as a minimum of 5 on the pKi scale. The hERG inhibition activity was
tested in a hERG functional assay as previously published (ref. 59). AMPA,
α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor; BZP,
benzodiazepine receptor; DAT, dopamine transporter; DOR, delta (δ) opioid
receptor; KA, kainic acid receptor; KOR, kappa (κ) opioid receptor; MOR,
mu (μ) opioid receptor; NAT, noradrenaline transporter; NMDA,
N-methyl-D-aspartate receptor; ND, not determined; PBR, peripheral
benzodiazepine binding site; SERT, serotonin transporter; hERG, human
ether-a-go-go-related gene (potassium channel Kv11.1).
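
The flooring rule used for the heat map can be sketched as a small helper. The function name and argument layout are hypothetical; the sketch assumes Ki is supplied in molar units and that targets without a measured Ki are reported as not determined.

```python
import math


def pki_for_heatmap(ki_molar, pct_inhibition_at_10um=None, floor=5.0):
    """Map a binding result onto the heat-map pKi scale.

    Returns `floor` when primary inhibition at 10 µM is below 50% or when
    the computed pKi is below the floor; returns None when no Ki is available.
    """
    if pct_inhibition_at_10um is not None and pct_inhibition_at_10um < 50.0:
        return floor
    if ki_molar is None:
        return None  # not determined (ND)
    return max(floor, -math.log10(ki_molar))
```

Under this rule a compound with Ki = 100 µM (pKi 4) is plotted at 5, whereas one with Ki = 100 nM is plotted at its actual pKi of 7.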




Extended Data Figure 4 | Confirmation of modelling results via
mutagenesis. a, b, Protons showed agonist activity at GPR68 wild-type
and mutant receptors in cAMP production (a) and calcium release (b);
parameters are in Supplementary Table 4. c, Relative GPR68 wild-type
and mutant receptor expression levels determined by anti-Flag
immunoblotting (n = 3). d, Proton-mediated cAMP production in
untransfected cells (n = 16). e, Calcium release by lorazepam and
selected ZINC compounds (10 µM at pH 8.0, n = 6–22 measurements).
f–j, Effect of ogerin and ZINC32547799 (10 µM) on proton-mediated
cAMP production (f and g, n = 4), calcium release (h and i, n = 3), and
phosphatidylinositol hydrolysis (j, n = 3) at GPR68 wild-type or mutant-
transfected HEK293T cells. k, Effect of ogerin and ZINC32547799 on
phosphatidylinositol hydrolysis at pH 8.4 in GPR68-transfected
HEK293T cells (n = 3). Normalized results represent mean ± s.e.m. and
curves were analysed using a four-parameter logistic function.




Extended Data Figure 5 | Control experiments for signalling and
pharmacology. a, Basal cAMP production of GPR68 wild-type and
mutant receptors (mean ± s.e.m., n = 24–46 measurements).
b, pH-dependent activity of ogerin at GPR68 wild type (mean ± s.e.m., n = 3).
c, Ogerin concentration–responses at GPR68 wild-type and mutant
receptors at pH 9.0 (mean ± s.e.m., n = 3), under which cAMP reporter
assay was not affected (d–f). d–g, Proton modulated isoproterenol-
mediated Gs-activation via β-adrenergic receptors in untransfected (d, f)
and GPR68-transfected (e, g) cells. Normalized results (basal at
pH 9.5 for d and e; or corresponding buffer control for f and g) represent
mean ± s.e.m. (n = 6). h, i, Inverse agonist and antagonist activity
(Ki of 220 nM) of ogerin at A2A (cAMP production, h) and weak antagonist
activity (Ki of 736 nM) at 5-HT2B receptors (calcium mobilization, i).
5′-N-ethylcarboxamidoadenosine (NECA) and 2-chloro-N6-
cyclopentyladenosine (CCPA) served as agonist controls, while CGS15943
is an inverse agonist control for A2A receptors. Normalized results
represent mean ± s.e.m. (n = 3). Curves were analysed in GraphPad Prism
with the built-in four-parameter logistic function. j, k, Lead compounds
(10 µM) showed no agonist (j) or antagonist (k) activity at CXCR4
receptors (cAMP production) with CXCL12 as an agonist control (1 or
3 µM) or AMD 3100 (10 µM) as an antagonist control. Results represent
mean ± s.d. (n = 2).




Extended Data Figure 6 | Primary screening and comparison of
allosteric parameters of 13 ogerin analogues at GPR68. The 13 ogerin
analogues (structures in Supplementary Table 9) identified from docking
a virtual library of more than 600 ogerin derivatives were synthesized
(Supplementary Information). a–e, Production of cAMP was measured in
transiently transfected HEK293T cells at 10 µM and five different
pH conditions, pH 8.4 (a); pH 7.9 (b); pH 7.4 (c); pH 7.0 (d); and pH 6.5 (e),
to reveal any pH-dependent potentiation activity. Normalized results
represent mean ± s.e.m. (n = 8–16 measurements). f, Graphic comparison
of the allosteric parameters log α and log β. Proton concentration–responses
were carried out in the absence and presence of increasing concentrations
of ogerin and its analogues; results were analysed using a standard
allosteric operational model to obtain allosteric parameters. Values
represent mean ± s.e.m. (n > 3; see details in Supplementary Table 8).
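
The caption refers to a standard allosteric operational model without giving the equation. As an aid to interpreting log α and log β, the sketch below implements one widely used form of the operational model of allosterism; whether it matches the exact parameterization fitted in this study is an assumption, and all argument names are illustrative.

```python
def allosteric_operational(a, b, em, ka, kb, tau_a, tau_b, alpha, beta, n=1.0):
    """Response to orthosteric agonist `a` in the presence of modulator `b`.

    em: maximal system response; ka, kb: equilibrium dissociation constants of
    agonist and modulator; tau_a, tau_b: efficacies; alpha: affinity
    cooperativity; beta: efficacy cooperativity; n: transducer slope.
    Concentrations and dissociation constants share the same (molar) units.
    """
    stimulus = (tau_a * a * (kb + alpha * beta * b) + tau_b * b * ka) ** n
    occupancy = (a * kb + ka * kb + ka * b + alpha * a * b) ** n
    return em * stimulus / (occupancy + stimulus)
```

With no modulator present the expression reduces to the ordinary operational model for the agonist alone, and values of α or β above 1 shift or raise the agonist curve, as expected for a positive allosteric modulator.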




Extended Data Figure 7 | Characterization of potent GPR68 PAMs.
a–c, Concentration–response curves of H⁺ in the absence and presence
of increasing concentrations of CGH2466 (a, a′), tracazolate (b, b′) and
SLV320 (c, c′) and in the absence (left column, a, b, c) and presence (right
column, a′, b′, c′) of phosphodiesterase inhibitor (Ro 20-1724, 30 μM)
at GPR68-expressing cells. Normalized results (mean ± s.e.m., n = 8 for
CGH2466; n = 5 for tracazolate; n = 5 for SLV320 for left column and n = 3
for right column) were analysed using a four-parameter logistic function
and the standard allosteric operational model (not shown). Allosteric
parameters in the absence of Ro 20-1724 are summarized in Supplementary
Table 8. For each pair of fittings, the proton potency value (negative
logarithm of the half-maximum effective concentration (pEC50)) from
the agonist concentration–response curve (right) in the absence of testing
compound was used as the pK_A for the allosteric operational model (left).
d, Schematic showing the shared pharmacology among GABA_A, adenosine
GPCRs and GPR68 ligands. Molecules along each edge of the triangle have
been shown to have activity at both targets, whereas tracazolate, in the
middle, shows activity at all three.
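
The four-parameter logistic function named in the caption above can be sketched as follows. This is a minimal illustration, assuming the common variable-slope sigmoid parameterization (bottom, top, log EC50, Hill slope); the parameter values are hypothetical and are not taken from the paper or its supplementary tables.

```python
def four_parameter_logistic(log_conc, bottom, top, log_ec50, hill_slope):
    """Response at a given log10 agonist concentration (variable-slope sigmoid)."""
    exponent = (log_ec50 - log_conc) * hill_slope
    return bottom + (top - bottom) / (1.0 + 10.0 ** exponent)


def pec50(log_ec50):
    """pEC50 is the negative logarithm of the half-maximum effective concentration."""
    return -log_ec50


# Hypothetical curve: 0-100% response, half-maximal at log[H+] = -7.0, unit slope.
BOTTOM, TOP, LOG_EC50, HILL = 0.0, 100.0, -7.0, 1.0

for log_h in (-8.5, -8.0, -7.5, -7.0, -6.5, -6.0, -5.5):
    response = four_parameter_logistic(log_h, BOTTOM, TOP, LOG_EC50, HILL)
    print(f"log[H+] = {log_h:5.1f}  response = {response:6.2f}")

print("pEC50 =", pec50(LOG_EC50))
```

At the midpoint (log concentration equal to log EC50) the response is halfway between bottom and top, which is why the fitted pEC50 can serve as a potency summary for each curve.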




Extended Data Figure 8 | GPR68 mouse biology. a, Ogerin (Og) and
lorazepam (Lo) activate PKA and p42/p44 MAP kinase in HEK293
cells stably expressing haemagglutinin (HA)-tagged GPR68 but not
HApcDNA; vehicle (Ve). b, c, GPR68 knockout (n = 7) mice exhibited no
differences in contextual memory retrieval (b) or cued memory retrieval
(c) as compared to wild-type mice (n = 8). d–f, At 10 mg kg⁻¹, the ogerin
isomer ZINC32547799 had no effect on learning (d) or contextual and cue
memory (e, f) in wild-type mice (vehicle, n = 6; drug, n = 7).
g–i, At 30 mg kg⁻¹, ZINC32547799 enhanced wild-type learning (g,
drug × time interaction, F(3,39) = 3.58, P = 0.022; drug alone F(1,39) = 1.19,
P = 0.295; Bonferroni post-hoc test revealed a significant effect (P < 0.05)
at the third unconditioned/conditioned stimulus training point, two-way
ANOVA), but had no effect at contextual and cue memory (h, i) (vehicle,
n = 7; drug, n = 8). Male mice aged 6–8 weeks were used in the test.
Normalized contextual memory retrieval (d) and cued memory retrieval
(f) are presented in Fig. 4c, d.




Extended Data Figure 9 | Screening of ZINC compounds predicted to
be active at GPR65 based on BTB09089 docking poses. a–d, Primary
screening with ZINC compounds (30 μM) for agonist activity at GPR65
when receptors were kept inactive at pH 8.40 (a); at control HEK293T cells
for nonspecific activity (b); at GPR65 when receptors were activated at
pH 7.40 for modulator or antagonist activity (c); at GPR65 when receptors
were activated by BTB09089 (30 μM) at pH 8.40 for modulator or
antagonist activity (d). Normalized results represent mean ± s.e.m.
from a minimum of three assays (each in minimum of triplicate and a
total of > 16 measurements). The red dashed line in b–d indicates the 20%
inhibition line (an arbitrary cut-off line).




Extended Data Figure 10 | Characterization of GPR65 allosteric
modulators at wild-type and mutant receptors. a, BTB09089 showed
weak agonist activity, but failed to potentiate proton activity at GPR65
(n = 8). b, Selected compounds from Extended Data Fig. 9b, c were
tested for GPR65-specific inhibition (n = 16–56 measurements). Several
compounds (such as ZINC41613384, ZINC9468042 and ZINC62678696)
showed GPR65-specific inhibition. c, Selected compounds from Extended
Data Fig. 9b, d were tested for antagonist activity against BTB09089-
activated signal at GPR65 (n = 16–64 measurements). ZINC62678696
showed GPR65-specific inhibition when it was activated by either proton
or BTB09089. d, ZINC62678696 inhibited GPR65 activity. e–g, Proton
concentration–responses (e), BTB09089 concentration–responses (f),
and ZINC13684400 concentration–responses (g) at GPR65 mutant
receptors. Normalized results represent mean ± s.e.m. (n > 3) and curves
were analysed in GraphPad Prism with a standard four-parameter logistic
function. Corresponding curves of proton at GPR65 wild-type receptors
(from Extended Data Fig. 4a) and BTB09089 and ZINC13684400 (from
Fig. 5e) are also included (dashed lines) for comparison. Pharmacological
parameters are listed in Supplementary Table 1




Extended Data Figure 11 | In vivo behavioural profiling of GPR68-
knockout mice. a, b, No effects of GPR68 deletion on distance travelled in
an open field. Data represent mean ± s.e.m. for each group for a one-hour
test session. c, d, No difference on latency to fall from an accelerating
rotarod. Data represent mean ± s.e.m. for each group. e–h, Decreased
startle responses in GPR68 knockout mice after presentation of acoustic
stimuli (e, f). Data represent mean ± s.e.m. for each group. No effects
of genotype were found for levels of prepulse inhibition (g, h). Data
represent mean ± s.e.m. for each group (*P < 0.05). i–l, No difference at
acquisition and reversal learning in the Morris water maze. Data represent
mean ± s.e.m. of four trials per day. Subject numbers were 9 wild-type and
7 knockout male mice, and 12 wild-type and 11 knockout female mice.


LETTER 


doi:10.1038/nature15747 


Extremely metal-poor stars from the cosmic dawn 
in the bulge of the Milky Way 


L. M. Howes¹, A. R. Casey², M. Asplund¹, S. C. Keller¹, D. Yong¹, D. M. Nataf¹, R. Poleski³,⁴, K. Lind⁵, C. Kobayashi⁶,
C. I. Owen¹, M. Ness⁷, M. S. Bessell¹, G. S. Da Costa¹, B. P. Schmidt¹, P. Tisserand¹,⁸, A. Udalski³, M. K. Szymański³,
I. Soszyński³, G. Pietrzyński³,⁹, K. Ulaczyk³,¹⁰, Ł. Wyrzykowski³, P. Pietrukowicz³, J. Skowron³, S. Kozłowski³ & P. Mróz³


The first stars are predicted to have formed within 200 million
years after the Big Bang¹, initiating the cosmic dawn. A true first
star has not yet been discovered, although stars²⁻⁴ with tiny amounts
of elements heavier than helium (‘metals’) have been found in the
outer regions (‘halo’) of the Milky Way. The first stars and their
immediate successors should, however, preferentially be found today
in the central regions (‘bulges’) of galaxies, because they formed in
the largest over-densities that grew gravitationally with time⁵,⁶. The
Milky Way bulge underwent a rapid chemical enrichment during
the first 1–2 billion years⁷, leading to a dearth of early, metal-poor
stars⁸,⁹. Here we report observations of extremely metal-poor stars
in the Milky Way bulge, including one star with an iron abundance 
about 10,000 times lower than the solar value without noticeable 
carbon enhancement. We confirm that most of the metal-poor 
bulge stars are on tight orbits around the Galactic Centre, rather 
than being halo stars passing through the bulge, as expected for stars 
formed at redshifts greater than 15. Their chemical compositions 
are in general similar to typical halo stars of the same metallicity 
although intriguing differences exist, including lower abundances 
of carbon. 

Stars with a low content of heavy elements have distinct spectral 
flux distributions, which are reflected in their colours. Using the 
photometric filter system on the SkyMapper telescope operated by the 
Australian National University, it is possible to identify metal-poor 
candidate stars¹⁰ in the Galactic halo⁴ and bulge⁹. We have observed
~14,000 bulge stars preselected from SkyMapper photometry using the 
AAOmega spectrograph on the Anglo-Australian Telescope (AAT), 
which enables the acquisition of 400 simultaneous stellar spectra over 
a 2-degree field of view. More than 500 stars with an iron abundance 
less than 1/100th of the solar value have been identified, making our 
survey the first to successfully target metal-poor stars in the Milky Way 
bulge. Twenty-three of these stars, targeted as the most metal-poor 
ones on the basis of the intermediate resolution spectra (Extended Data 
Table 1), were observed in June 2014 with the MIKE high-resolution 
spectrograph on the 6.5-m Magellan Clay telescope¹¹ to enable a
comprehensive determination of their chemical compositions (Fig. 1).

The stars’ effective temperatures were derived through fitting the
observed hydrogen lines with theoretical spectra, while neutral and
ionized iron lines provided measurements of the surface gravities and
metallicities in the framework of 1D stellar atmosphere models¹² and
non-equilibrium spectral line formation¹³ (Extended Data Table 2).
All 23 stars were found to have [Fe/H] < −2.3, including nine stars
with [Fe/H] < −3 (here $[\mathrm{A/B}] = \log_{10}(N_\mathrm{A}/N_\mathrm{B})_{\star} - \log_{10}(N_\mathrm{A}/N_\mathrm{B})_{\odot}$,
where $N_\mathrm{A}/N_\mathrm{B}$ refers to the number ratio of atoms of elements A
and B in the star (⋆ subscript) and the Sun (⊙ subscript)).


The most metal-poor star, SMSS J181609.62−333218.7, has
[Fe/H] = −3.94 ± 0.16. The abundances of an additional 22 elements
were determined spectroscopically, including the α-elements Mg, Si,
Ca, and Ti, and the neutron-capture elements Y, Zr, and Ba (Extended
Data Tables 3, 4, 5).
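
As an illustration of the bracket notation defined above, the short sketch below converts a logarithmic abundance [A/B] into a linear ratio relative to the Sun. The function name is hypothetical; the input values are the [Fe/H] and uncertainty quoted in the text for SMSS J181609.62−333218.7.

```python
def ratio_relative_to_sun(bracket_dex):
    """Convert [A/B] (in dex) to (N_A/N_B)_star divided by (N_A/N_B)_sun."""
    return 10.0 ** bracket_dex


FE_H = -3.94       # [Fe/H] quoted in the text
FE_H_ERR = 0.16    # quoted uncertainty in dex

# Depletion factor: how many times lower the iron abundance is than solar.
depletion = 1.0 / ratio_relative_to_sun(FE_H)
depletion_low = 1.0 / ratio_relative_to_sun(FE_H + FE_H_ERR)
depletion_high = 1.0 / ratio_relative_to_sun(FE_H - FE_H_ERR)

print(f"Iron abundance is ~{depletion:,.0f} times lower than solar")
print(f"Range from the quoted uncertainty: {depletion_low:,.0f} to {depletion_high:,.0f}")
```

The central value comes out just under 10⁴, in line with the abstract's "about 10,000 times lower than the solar value".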

To confirm their bulge membership, the distances and orbits of
the stars have been determined. Using the spectroscopic temperatures
and surface gravities, and an assumed mass of 0.8 M⊙, distances
were inferred, which in nearly all cases are consistent with them being
located within the bulge (Fig. 2). We have measured velocities for ten
of our stars using observations taken by the OGLE-IV survey¹⁴, from
which orbits around the Galaxy have been determined in combination
with their distances and velocities (Extended Data Table 6); the
remaining stars fall outside the OGLE footprint while other sources of
kinematic information are too uncertain to constrain the orbits
sufficiently. Seven out of the ten stars with accurate kinematics are shown
to have tightly bound orbits, placing them in the inner regions of the
Milky Way (Fig. 2). In particular, using a cut-off radius of 3.43 kpc as
the radius of the bulge component¹⁵, the most metal-poor star SMSS
J181609.62−333218.7 has an orbit entirely contained within the bulge.
Only two out of the ten stars are on much larger orbits, being merely
halo stars currently passing through the bulge region. Extending these
numbers to the whole sample, we can expect ~14 of the 23 bulge stars
analysed here to have orbits fully within the central regions of the Milky
Way; with the imminent arrival of kinematic data from the Gaia satellite,
it will be possible to determine accurate orbits for all of the bulge stars.
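
The step from spectroscopic temperature, surface gravity and an assumed mass to a distance can be sketched as follows. This is only an illustration of the underlying scaling (g = GM/R² and L = 4πR²σT⁴ give L ∝ M T⁴ / g), not the authors' procedure, which is not described in this section. The solar reference values are the usual nominal ones, the apparent bolometric magnitude is a hypothetical placeholder, and extinction is ignored.

```python
import math

# Nominal solar reference values (assumed here, not taken from the paper).
T_SUN_K = 5772.0    # effective temperature
LOGG_SUN = 4.438    # log10 surface gravity, cgs
MBOL_SUN = 4.74     # absolute bolometric magnitude


def log_luminosity_solar(teff_k, logg_cgs, mass_msun):
    """log10(L/L_sun) from L proportional to M * Teff^4 / g."""
    return (math.log10(mass_msun)
            + 4.0 * math.log10(teff_k / T_SUN_K)
            - (logg_cgs - LOGG_SUN))


def distance_pc(apparent_mbol, log_lum):
    """Distance from the distance modulus, ignoring extinction."""
    absolute_mbol = MBOL_SUN - 2.5 * log_lum
    return 10.0 ** ((apparent_mbol - absolute_mbol + 5.0) / 5.0)


# Stellar parameters quoted for SMSS J181609.62-333218.7 (Fig. 1) and the
# 0.8 solar-mass assumption stated in the text.
log_lum = log_luminosity_solar(teff_k=4809.0, logg_cgs=1.93, mass_msun=0.8)
print(f"log10(L/L_sun) = {log_lum:.2f}")

# Hypothetical apparent bolometric magnitude, for illustration only.
HYPOTHETICAL_MBOL = 14.0
print(f"distance = {distance_pc(HYPOTHETICAL_MBOL, log_lum) / 1000.0:.1f} kpc")
```

Because the luminosity scales inversely with surface gravity, an error of 0.1 dex in log g changes the inferred distance by roughly 12%, which is one reason distance uncertainties are carried through to the orbit determinations.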

The very first stars are predicted to have brought about the cosmic
dawn by forming in the centres of the largest dark matter mini-haloes,
which subsequently accreted material to become the inner regions of
the largest galaxies¹⁶. The typical redshift of formation for stars in the
bulge with [Fe/H] < −1 is z ≈ 10, in contrast to z ≈ 5 for halo stars. Of
the stars with [Fe/H] < −3, approximately 15% are expected to have
formed at z> 15 (refs 5, 6). Of the ten stars with accurate orbit infor- 
mation, half of them have binding energies E,.,<—8 x 10°-*km? s~?, 
which is consistent with a formation redshift of z> 15 (ref. 5). Low 
binding energies imply that the stars have been in the Galactic potential 
well for some time and it is very unlikely they have been accreted from 
a recent dwarf spheroidal merger. Their low metallicities, orbits and 
binding energies make these stars prime candidates for being direct 
descendants of the very first stars, probing a cosmic epoch otherwise 
completely inaccessible currently. Direct age determinations of these 
ancient and extremely metal-poor bulge stars from comparison with 
stellar evolutionary tracks or radioactive U or Th dating are currently 
not possible, but asteroseismic ages could possibly be inferred with the 
extended Kepler mission or future satellites. 


¹Research School of Astronomy and Astrophysics, Australian National University, Australian Capital Territory 2601, Australia. ²Institute of Astronomy, University of Cambridge, Madingley Road,
Cambridge CB3 0HA, UK. ³Warsaw University Observatory, Aleje Ujazdowskie 4, 00-478 Warszawa, Poland. ⁴Department of Astronomy, Ohio State University, 140 West 18th Avenue, Columbus,
Ohio 43210, USA. ⁵Department of Physics and Astronomy, Division of Astronomy and Space Physics, Uppsala University, Box 516, SE-751 20 Uppsala, Sweden. ⁶School of Physics, Astronomy
and Mathematics, Centre for Astrophysics Research, University of Hertfordshire, College Lane, Hatfield AL10 9AB, UK. ⁷Max-Planck-Institut für Astronomie, Königstuhl 17, D-69117 Heidelberg,
Germany. ⁸Sorbonne Universités, UPMC Université Paris 6 et CNRS, UMR 7095, Institut d’Astrophysique de Paris, 98 bis Boulevard Arago, 75014 Paris, France. ⁹Universidad de Concepción,
Departamento de Astronomía, Casilla 160-C, Concepción, Chile. ¹⁰Department of Physics, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, UK.


484 | NATURE | VOL 527 | 26 NOVEMBER 2015 




Figure 1 | Extracts of the spectrum of the lowest-metallicity star in
our sample. a, A section of the spectrum of SMSS J181609.62−333218.7
(black line), the most metal-poor bulge star known. In blue is the predicted
spectrum with the inferred stellar parameters (effective temperature
T_eff = 4,809 K, log(g) = 1.93 (here surface gravity g is in cgs units),
[Fe/H] = −3.94, [Mg/Fe] = 0.20), and the red and green lines show spectra
with all abundances scaled to ±0.15 dex, respectively. All three predicted
spectra were created using the 1D local thermodynamic equilibrium (LTE)
spectrum synthesis programme, MOOG²⁶. b, The Hβ line of the same
star, compared to three synthetic spectral line profiles²⁷ computed with
T_eff = 4,640 K (red, dash-dot), 4,800 K (purple, continuous), and 4,960 K
(blue, dashed).


Figure 2 | The Galactic positions and orbits of the 23 stars observed at
high resolution. a, Surface density map of a model of the Galactic bulge
projected onto the X–Z (top) and X–Y (bottom) planes²⁸, where X, Y,
and Z are Cartesian coordinates with the origin at the Galactic Centre
and Z perpendicular to the plane of the Galaxy. Plotted over this (filled
black circles) are the 23 stars of this study, with distance uncertainties
shown as error bars, and a circle of radius 3.43 kpc (white: the cut-off
radius of the inner bulge determined from 2MASS data¹⁵). The position
of the Sun is shown with a red diamond, at 8.5 kpc from the Galactic
Centre. b, Projections of the orbit of the lowest metallicity star, SMSS
J181609.62−333218.7, both in the (R, Z) plane (right), where R is the
radial direction, and in the plane of the orbit itself (left).


Given their extremely low metallicities and large formation redshifts,
these stars are likely to have formed from gas polluted by ejecta
from a single or at most a few supernovae of the first stellar generation.
A chemical composition analysis has been carried out to search for
tell-tale nucleosynthetic signatures and possible differences from halo
stars at the same metallicities. For most elements, the chemical
compositions of the 23 bulge stars are consistent with typical halo stars,
suggesting enrichment by similar supernovae in spite of the distinct
environments and formation redshifts. Subtle differences do exist,
however, most notably in terms of the carbon abundances. None of the
23 stars have the large observed carbon enhancements that occur
frequently in halo stars. Even after applying evolutionary corrections to
the surface carbon abundance to counter the mixing that occurs with
material processed by H-burning through CNO-cycling at late stages of
the stellar lifetime¹⁷, only one of the stars would have had a natal
[C/Fe] > 1. In the halo, the percentage of stars that are carbon-enhanced
increases dramatically at lower metallicities, from 27% of stars with
[Fe/H] < −2 up to 69% with [Fe/H] < −4 (ref. 17). From the literature
data on halo stars with similar iron abundances to our stars¹⁷, the
probability of selecting at most one carbon-enhanced star out of 23
halo stars is only 0.2%. Carbon-enhanced stars come in two varieties:
those with and those without large excesses of neutron-capture
elements. The former are most likely to have been formed by mass-transfer
from a binary companion that underwent the asymptotic giant branch
phase. Those carbon-enhanced stars with neutron-capture excesses
occur most frequently at metallicities of [Fe/H] > −3, whereas those
without do not appear to have binary companions, and are more
common at the very lowest metallicities. As none of our bulge stars are
classified as having large abundances of neutron-capture elements,
the likelihood of finding one such carbon-enhanced star out of 23 is
7% if the frequency is the same for the bulge as for the halo. A lower
frequency of carbon-enhanced stars in the bulge relative to the halo is
contrary to theoretical predictions; the expected dependence of the
initial mass function on the cosmic microwave background¹⁸ would result
in a greater number of carbon-enhanced stars near the centre of the
Galaxy.
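
The kind of calculation behind "the probability of selecting at most one carbon-enhanced star out of 23" can be sketched with a binomial model. This is an illustrative assumption, not the authors' stated method: the 0.2% in the text was derived from literature data on halo stars matched in iron abundance, and the matched carbon-enhanced fraction is not given in this section. The value of 27% used below is simply the halo fraction quoted for [Fe/H] < −2.

```python
from math import comb


def prob_at_most(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(k_max + 1))


N_STARS = 23
ASSUMED_FRACTION = 0.27  # illustrative: quoted halo fraction for [Fe/H] < -2

p_value = prob_at_most(1, N_STARS, ASSUMED_FRACTION)
print(f"P(at most 1 of {N_STARS}) = {p_value:.4f}")
```

Even with this lowest quoted halo fraction the probability is below 1%; a higher fraction, as expected for a more metal-poor comparison sample, makes the observed deficit less likely still.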

The most metal-poor bulge star, SMSS J181609.62−333218.7, is at
least an order of magnitude more iron-deficient than previously found
low-metallicity bulge stars⁹,¹⁹. We have not been able to detect C in
its spectrum, instead finding only an upper limit to its C abundance
(Extended Data Fig. 1). This makes the upper limit on its total metallicity
[Z/H] = −3.8 (total mass fraction of Z ≈ 2.1 × 10⁻⁶), where Z
represents the sum of all metals, placing it amongst the four most
metal-poor stars known, along with the halo star SDSS J102915+172927
(ref. 3). The low C abundances measured in both these stars fall below the predicted



Figure 3 | Chemical abundances of the 23 stars observed at high
resolution. a, The abundance ratio of carbon versus iron ([C/Fe]), with
respect to metallicity ([Fe/H]) measured in the observed stars (filled
red circles, red arrow for an upper limit). The dotted lines represent
the solar abundances. Also shown for comparison are literature metal-
poor halo giants (small black dots²⁴) and more metal-rich bulge stars
(filled blue triangles²⁹). b, The chemical abundance pattern of SMSS
J181609.62−333218.7, for elements X, where X is displayed at the top of
the figure. Each determined abundance is shown as an open black star.
These abundances are compared to three synthetic supernovae yields: a
pair-instability supernova of 170 M⊙ (PISN; blue, short dash²²), a
core-collapse supernova of 15 M⊙ (SN; green, long dash²¹), and a hypernova of
40 M⊙ (HN; red, solid²¹). Dashed grey arrows represent expected non-LTE
corrections; solid arrows represent measurements where only an upper 
or lower limit was possible. The error bars in a and b are estimates of the 
uncertainties in our measurements, calculated as described in Methods. 


metallicity limit for formation of low-mass stars due to metal line
cooling²⁰.
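
The link between the quoted [Z/H] and the total metal mass fraction can be checked with a one-line conversion. The sketch below assumes a solar metal mass fraction of about 0.0134 and the same hydrogen mass fraction in the star as in the Sun; both are assumptions made here for illustration rather than values stated in the text.

```python
Z_SUN_ASSUMED = 0.0134  # assumed solar metal mass fraction


def metal_mass_fraction(z_over_h_dex, z_sun=Z_SUN_ASSUMED):
    """Approximate Z from [Z/H], taking the hydrogen fraction as solar."""
    return z_sun * 10.0 ** z_over_h_dex


z_star = metal_mass_fraction(-3.8)
print(f"Z = {z_star:.2e}")
```

Under these assumptions the result agrees with the mass fraction of about 2.1 × 10⁻⁶ given in the text.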

We have compared the detailed chemical abundance pattern of
SMSS J181609.62−333218.7 to primordial supernovae yields²¹,²²
(Fig. 3). In particular, the low Mg and Ca abundance, but higher Si
abundance, and the absence of a pronounced odd–even abundance
pattern rule out the possibility of enrichment by a pair-instability
supernova resulting from a primordial star of (140–250) M⊙. Low
abundances of Cr and Mn and of α-elements, combined with the
higher abundance of Co, indicate that the polluting supernova was
most likely to have been a primordial hypernova: an extremely
energetic kind of supernova releasing ten times the kinetic energy of
regular core-collapse supernovae, possibly due to the forming black
hole having larger angular momentum²³. Good agreement is found
for a 40 M⊙ hypernova; a more stringent Zn limit would further
constrain the mass range. Unusual abundance ratios have been found in
small numbers of stars in the halo (4% of halo stars with low carbon
abundances have chemical peculiarities in at least two elements²⁴),
but none so far appear to have been polluted by a 40 M⊙ hypernova.
A low [α/Fe] ratio (0.14 dex; here α indicates α-elements) at such
low [Fe/H] is consistent with an inhomogeneous enrichment from
such supernovae²⁵, while stars with higher [α/Fe] formed from more
well-mixed gas due to a longer time delay in forming the second
generation of stars.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 1 May; accepted 24 August 2015. 
Published online 11 November 2015. 


1. Bromm, V., Yoshida, N., Hernquist, L. & McKee, C. F. The formation of the first 
stars and galaxies. Nature 459, 49-54 (2009). 

2. Christlieb, N. et al. A stellar relic from the early Milky Way. Nature 419,
904-906 (2002).

3. Caffau, E. et al. An extremely primitive star in the Galactic halo. Nature 477,
67-69 (2011).

4. Keller, S. C. et al. A single low-energy iron-poor supernova as the source of
metals in the star SMSS J031300.36−670839.3. Nature 506, 463-466 (2014).

5. Tumlinson, J. Chemical evolution in hierarchical models of cosmic structure. II. 
The formation of the Milky Way stellar halo and the distribution of the oldest 
stars. Astrophys. J. 708, 1398-1418 (2010). 

6. Salvadori, S., Ferrara, A., Schneider, R., Scannapieco, E. & Kawata, D. Mining the 
Galactic halo for very metal-poor stars. Mon. Not. R. Astron. Soc. 401, L5-L9 
(2010). 

7.  Feltzing, S. & Gilmore, G. Age and metallicity gradients in the Galactic bulge. 
Astrophys. Space Sci. 265, 337-340 (1999). 

8. Garcia Pérez, A. E. et al. Very metal-poor stars in the outer Galactic bulge found 
by the APOGEE survey. Astrophys. J. 767, L9 (2013). 

9. Howes, L. M. et al. The Gaia-ESO survey: the most metal-poor stars in the
Galactic bulge. Mon. Not. R. Astron. Soc. 445, 4241-4246 (2014).

10. Keller, S. C. et al. The SkyMapper Telescope and the Southern Sky Survey.
Publ. Astron. Soc. Aust. 24, 1-12 (2007). 

11. Bernstein, R., Shectman, S. A., Gunnels, S. M., Mochnacki, S. & Athey, A. E.
MIKE: a double echelle spectrograph for the Magellan Telescopes at Las
Campanas Observatory. Proc. SPIE 4841, 1694-1704 (2003).

12. Gustafsson, B. et al. A grid of MARCS model atmospheres for late-type stars.
Astron. Astrophys. 486, 951-970 (2008).

13. Lind, K., Bergemann, M. & Asplund, M. Non-LTE line formation of Fe in
late-type stars – II. 1D spectroscopic stellar parameters. Mon. Not. R. Astron.
Soc. 427, 50-60 (2012).

14. Udalski, A., Szymański, M. K. & Szymański, G. OGLE-IV: fourth phase of the
Optical Gravitational Lensing Experiment. Acta Astron. 65, 1-38 (2015).

15. Robin, A. C., Marshall, D. J., Schultheis, M. & Reylé, C. Stellar populations in the
Milky Way bulge region: towards solving the Galactic bulge and bar shapes
using 2MASS data. Astron. Astrophys. 538, A106-A120 (2012).

16. Greif, T. H. et al. Formation and evolution of primordial protostellar systems.
Mon. Not. R. Astron. Soc. 424, 399-415 (2012).

17. Placco, V. M., Frebel, A., Beers, T. C. & Stancliffe, R. J. Carbon-enhanced 
metal-poor star frequencies in the Galaxy: corrections for the effect of 
evolutionary status on carbon abundances. Astrophys. J. 797, 21 (2014). 

18. Tumlinson, J. Carbon-enhanced metal-poor stars, the cosmic microwave 
background, and the stellar initial mass function in the early universe. 
Astrophys. J. 664, L63-L66 (2007). 

19. Schlaufman, K. C. & Casey, A. R. The best and brightest metal-poor stars. 
Astrophys. J. 797, 13 (2014). 

20. Frebel, A., Johnson, J. L. & Bromm, V. Probing the formation of the first 
low-mass stars with stellar archaeology. Mon. Not. R. Astron. Soc. 380, L40-L44 
(2007). 

21. Kobayashi, C., Ishigaki, M. N., Tominaga, N. & Nomoto, K. The origin of low
[α/Fe] ratios in extremely metal-poor stars. Astrophys. J. 785, L5 (2014).

22. Umeda, H. & Nomoto, K. Nucleosynthesis of zinc and iron peak elements in
population III type II supernovae: comparison with abundances of very metal
poor halo stars. Astrophys. J. 565, 385-404 (2002).

23. Nomoto, K. et al. Nucleosynthesis in black-hole-forming supernovae and 
extremely metal-poor stars. Prog. Theor. Phys. 151 (Suppl.), 44-53 (2003). 

24. Yong, D. et al. The most metal-poor stars. Il. Chemical abundances of 190 
metal-poor stars including 10 new stars with [Fe/H] < —-3.5. Astrophys. J. 

762, 26-63 (2013). 

25. Karlsson, T., Bromm, V. & Bland-Hawthorn, J. Pregalactic metal enrichment: 
the chemical signatures of the first stars. Rev. Mod. Phys. 85, 809-848 (2013). 

26. Sneden, C., Bean, J., Ivans, I., Lucatello, S. & Sobeck, J. MOOG: LTE line analysis
and spectrum synthesis. Astrophysics Source Code Library
http://adsabs.harvard.edu/abs/2012ascl.soft02009S (2012).

27. Barklem, P. S., Piskunov, N. & O’Mara, B. J. Self-broadening in Balmer line wing 
formation in stellar atmospheres. Astron. Astrophys. 363, 1091-1105 (2000). 

28. Ness, M. et al. Young stars in an old bulge: a natural outcome of internal 
evolution in the Milky Way. Astrophys. J. 787, L19 (2014). 

29. Ryde, N. et al. Chemical abundances of 11 bulge stars from high-resolution, 
near-IR spectra. Astron. Astrophys. 509, A20-A35 (2010). 


Acknowledgements This paper includes data gathered with the 6.5-m Magellan
Telescopes located at Las Campanas Observatory, Chile. Australian access to
the Magellan Telescopes was supported through the Collaborative Research
Infrastructure Strategy of the Australian Federal Government. L.M.H. and M.A.
were supported by the Australian Research Council (FL110100012). A.R.C.
acknowledges support from the European Union FP7 programme through
ERC grant number 320360. Research on metal-poor stars with SkyMapper
is supported through Australian Research Council Discovery Projects grants
DP120101237 and DP150103294 (principal investigator G.S.D.C.). The OGLE
project received funding from the NSC, Poland (MAESTRO grant
2014/14/A/ST9/00121 to A.U.).


Author Contributions The project was initiated and led by M.A. The photometric
target selection was made by L.M.H., C.I.O. and D.M.N. using data from the
SkyMapper telescope developed by B.P.S., S.C.K., G.S.D.C., M.S.B. and P.T. The
low-resolution spectra were obtained by L.M.H. and C.I.O. The data were reduced
and analysed by L.M.H. using software developed by A.R.C. Target selection
for the high-resolution observations was done by L.M.H., M.A. and A.R.C. with
the observations carried out by L.M.H. and D.Y.; the reduction and subsequent
chemical analysis was completed by L.M.H. K.L. performed the non-LTE
spectral line formation calculations, C.K. interpreted the observed chemical
abundances in terms of supernova yields, and M.N. provided comparison bulge
data. R.P., A.U., M.K.S., L.S., G.P., K.U., L.W., P.P., J.S., S.K. and P.M. obtained the
OGLE observations, A.U. and M.K.S. constructed the reference images, and R.P.
measured the proper motions. The manuscript was written by M.A., L.M.H. and
A.R.C. with all authors contributing comments.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to L.M.H. 
(louise.howes@anu.edu.au). 


26 NOVEMBER 2015 | VOL 527 | NATURE | 487 




METHODS 

Observations. Photometry of the Milky Way bulge was acquired for the EMBLA 
survey” during the commissioning period of the SkyMapper telescope in 2012 and 
2013. Stars were selected from the photometry using a combination of the g, i, and 
v bandpasses, designed to give a reliable metallicity indicator’. 

Spectroscopic follow-up observations took place during 2012-14, making
use of the AAOmega+2dF multi-object spectrograph (ref. 30) on the Anglo-Australian
Telescope. With between 350 and 400 stars observed in each field, spectra of more
than 14,000 bulge stars have been obtained. The gratings used have a spectral
resolving power of 1,300 in the blue (370-580 nm) and 10,000 in the red
(840-885 nm). The data were reduced using the standard 2dfdr pipeline. Stellar
spectra were fitted using a generative model that simultaneously accounts for
stellar parameters (by interpolating from the AMBRE grid, ref. 31), continuum,
spectral resolution and radial velocity.

From the first two years of spectroscopic data, more than 50 stars were
identified as having [Fe/H] < −2.5. The high-resolution spectroscopic data of 23 stars
presented in this Letter are the result of observations using the MIKE spectrograph
at the Magellan Clay telescope (ref. 11) on 15-17 June 2014. All observations were taken
using a slit width of 0.7″, resulting in a resolving power of 35,000 in the blue and
31,000 in the red. The data were reduced using the CarPy data reduction pipeline
(ref. 32), before they were normalized and summed together using the SMH software
(ref. 33). The final spectra cover 330-890 nm.
Parameter and abundance determination. The stellar parameters (Extended
Data Table 2) were calculated iteratively, using the original parameters from
the low-resolution spectra as initial guesses. First, effective temperatures were
derived by fitting the wings of the Balmer Hα and Hβ lines with a synthetic
profile (Fig. 1). These profiles were created by linearly interpolating between a
grid of synthetic spectra (ref. 27). The best fits were found by χ² minimization, and
the adopted temperature is a weighted average of the two lines, with double weight
on the Hβ line because the predicted LTE effects are larger for Hα (ref. 34). The
difference between the temperatures calculated for each line was on average only
26 K. The log(g), microturbulence ξ, and [Fe/H] were then derived for that
temperature, by forcing the Fe I abundance to remain constant with respect to
reduced equivalent width, and equilibrium between the Fe I and Fe II abundances.
Fe I and Fe II abundances were measured from the equivalent widths of a maximum
of 66 Fe I lines and 24 Fe II lines (in the case of the most metal-poor star,
SMSS J181609.62−333218.7, these numbers are reduced to 10 Fe I lines and 4 Fe II
lines). Finally, a non-LTE correction is applied to the Fe I abundance, calculated
by taking the average of the line-by-line corrections (ref. 13). This correction
forces an offset between the Fe I and Fe II abundances, thus replacing the initial
equilibrium. This process is repeated until the parameters converge on a solution.
Throughout we use the 1D MARCS model atmospheres (ref. 12), and a shortened
version of the Gaia-ESO line list, with extra lines supplemented from ref. 35 due
to our wider wavelength coverage. The stellar abundances are referenced to the
solar abundances of ref. 36. This analysis method was tested on seven halo stars from
the literature”, and the offsets found were Teff = +28 K, log(g) = −0.2, and
[Fe/H] = −0.08 (literature values minus our values).
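
The double weighting of the second Balmer line described above can be written out as a one-line weighted mean. The sketch below is illustrative only: the function name and the two input temperatures are hypothetical (chosen to differ by the 26 K average quoted in the text), and the paper's actual fitting code is not reproduced here.

```python
def weighted_teff(teff_halpha: float, teff_hbeta: float,
                  w_halpha: float = 1.0, w_hbeta: float = 2.0) -> float:
    """Weighted mean of two Balmer-line temperatures in kelvin.

    The default weights give the H-beta value twice the weight of H-alpha,
    as stated in the Methods; the weights themselves are the only inputs
    taken from the text.
    """
    return (w_halpha * teff_halpha + w_hbeta * teff_hbeta) / (w_halpha + w_hbeta)


# Hypothetical pair of line-by-line temperatures that differ by 26 K.
print(weighted_teff(4800.0, 4826.0))
```

With these weights the adopted temperature sits two-thirds of the way from the Hα value to the Hβ value.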

The abundances were measured using the equivalent widths of atomic lines
(that were all on the linear part of the curve-of-growth), except in the case of C
(measured from the C-H molecular bands at 431.3 nm and 432.3 nm) and Ba
(synthesized in order to account for hyperfine splitting). Non-LTE corrections
were calculated for Li (ref. 37), Na (ref. 38), Mg, and Ca, and applied to the
individual line abundances. The literature halo abundances of Mg and Ca (ref.
25) shown in Fig. 3 have also had a non-LTE correction applied, in order to ensure
a fair comparison. Upper limits (3σ) were derived for some elements in those
stars where the lines were too weak to be detected (Extended Data Table 3).
The abundance offsets compared to the literature values averaged 0.10 ± 0.19
across those elements measured in common. Owing to the wavelengths covered by
the SkyMapper metallicity filter, it is possible that stars with extremely high C
abundance appeared to be more metal-rich, and so were not selected. However,
a similar study of metal-poor stars discovered in the halo with SkyMapper (ref. 39)
found the fraction of C-enhanced stars was identical to that reported in previous
surveys (ref. 40). Furthermore, we followed up 14,000 stars with intermediate-resolution
spectra, and determined metallicities using those spectra. The majority of the
stars observed had [Fe/H] > −1.0 and included some that had solar metallicities,
so it is highly unlikely that we missed any C-enhanced extremely metal-poor
stars in our selection.

The systematic uncertainties in the temperature determinations were estimated
to be ±100 K, and the statistical uncertainties averaged ±125 K, so when these are
combined in quadrature we conclude the total uncertainty to be ±160 K. The ξ
uncertainties are estimated to be ±0.2, mostly due to systematics.
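
The quadrature sum quoted for the temperature uncertainty can be checked directly. This is a minimal sketch; the helper name is hypothetical and only the two input values (100 K and 125 K) come from the text.

```python
import math


def combine_in_quadrature(*sigmas: float) -> float:
    """Root-sum-square of independent uncertainty terms."""
    return math.sqrt(sum(s * s for s in sigmas))


# Systematic (100 K) and average statistical (125 K) temperature uncertainties.
total = combine_in_quadrature(100.0, 125.0)
print(round(total))  # 160
```

The same helper applies to the [Fe/H] and abundance uncertainties described in the next paragraph, where more than two terms are combined.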


The standard errors of the individual line abundances of Fe I and Fe II were
combined in quadrature to evaluate the log(g) uncertainties. The differences
between the [Fe/H] values when varying the temperature, surface gravity, and
microturbulence by their respective errors were combined in quadrature with the
standard error of the Fe II lines to produce the [Fe/H] uncertainties. The individual
abundance errors were also calculated using this method, using the standard error
of the individual abundances for the lines of that particular element.

For SMSS J181609.62−333218.7, which has [Fe/H] = −3.94, a measurement
of Na was not possible, owing to the 818.3 nm and 819.4 nm lines being too weak,
and the Na D lines (588.9 nm and 589.5 nm) being partly blended with interstellar
Na lines. We have derived a range of possible values for this star, taking the
upper limit from the non-detection at 819.4 nm, and the lower limit from fitting
a Gaussian to the Na D lines, taking into account the interstellar Na.

Distances and orbital parameters. Distances to the stars were calculated by
comparing the absolute and apparent bolometric magnitudes. The absolute magnitudes
were recovered from the relation $M_{\mathrm{bol},*} = M_{\mathrm{bol},\odot} - 2.5\log(L_{*}/L_{\odot})$,
where the luminosities are calculated using
$L_{*} = 4\pi\sigma T_{\mathrm{eff}}^{4} M_{*} G / 10^{\log(g)}$, taking
$M_{*} = (0.8 \pm 0.2)\,M_{\odot}$ for all stars. The apparent bolometric magnitudes are
reconstructed from the 2MASS JHKs magnitudes (Extended Data Table 1), assuming
reddening (ref. 41) (as no more-recent reddening catalogue covers all 23 stars), via
the methodology of ref. 42. The proper motions are based on I-band images taken
during the OGLE-IV (ref. 14) observations of the Galactic bulge. Relative proper
motions were derived from multiple epochs of data for each field (ref. 43), and the
uncertainties are a combination of statistical and systematic (for each star, the
systematic uncertainty is estimated to be ~0.4 mas yr⁻¹). These were converted into
absolute proper motions by adding the predicted average bulge motion for each field,
calculated using the Besançon Galaxy model (ref. 44). The orbits were calculated using
the Python package galpy (ref. 45) and the Galactic potential assumed in these
calculations was a three-component Milky Way-like potential (ref. 46). To model the
uncertainty distributions, we sampled 1,000 orbits using a Monte Carlo simulation,
assuming a normal distribution for the uncertainties of the input parameters. The
results of this are included in Extended Data Table 6. One star,
SMSS J175455.52−380339.3, has an unbound Etot and impractically large orbital
parameters, suggesting that one or more of our input parameters need to be changed.
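
The distance estimate described above can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' code: the physical constants and the solar bolometric magnitude are standard values that do not appear in the text, the function names are hypothetical, and the reddening and bolometric corrections of refs 41 and 42 are not modelled (the apparent bolometric magnitude is taken as an input).

```python
import math

# Standard constants in cgs units; these values are assumptions of this sketch.
G_CGS = 6.674e-8        # gravitational constant (cm^3 g^-1 s^-2)
SIGMA_SB = 5.6704e-5    # Stefan-Boltzmann constant (erg cm^-2 s^-1 K^-4)
M_SUN_G = 1.989e33      # solar mass (g)
L_SUN = 3.828e33        # solar luminosity (erg s^-1)
MBOL_SUN = 4.74         # solar absolute bolometric magnitude


def luminosity_lsun(teff: float, logg: float, mass_msun: float = 0.8) -> float:
    """L*/Lsun from L* = 4 pi sigma Teff^4 M* G / 10**log(g), with g in cgs."""
    lum = (4.0 * math.pi * SIGMA_SB * teff ** 4
           * mass_msun * M_SUN_G * G_CGS / 10.0 ** logg)
    return lum / L_SUN


def absolute_mbol(teff: float, logg: float, mass_msun: float = 0.8) -> float:
    """Absolute bolometric magnitude from the luminosity ratio."""
    return MBOL_SUN - 2.5 * math.log10(luminosity_lsun(teff, logg, mass_msun))


def distance_kpc(apparent_mbol: float, teff: float, logg: float,
                 mass_msun: float = 0.8) -> float:
    """Distance from the bolometric distance modulus, in kiloparsecs."""
    modulus = apparent_mbol - absolute_mbol(teff, logg, mass_msun)
    return 10.0 ** (modulus / 5.0 + 1.0) / 1000.0


# Sanity check with solar parameters: the luminosity ratio should be close to 1.
print(luminosity_lsun(5772.0, 4.438, 1.0))
```

Repeating the calculation with the mass drawn from a normal distribution centred on 0.8 solar masses with a standard deviation of 0.2 would give a Monte Carlo spread in distance, in the spirit of the orbit sampling described in the text.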

Code availability. All codes used to analyse the data presented are publicly
available. In particular, the 1D LTE analysis used was made possible with the line
analysis and spectrum synthesis code MOOG (ref. 26).


30. Sharp, R. et al. Performance of AAOmega: the AAT multi-purpose fiber-fed
spectrograph. Proc. SPIE 6269, 1-13 (2006).

31. de Laverny, P., Recio-Blanco, A., Worley, C. C. & Plez, B. The AMBRE project:
a new synthetic grid of high-resolution FGKM stellar spectra. Astron. Astrophys.
544, A126-A137 (2012).

32. Kelson, D. MIKE pipeline http://code.obs.carnegiescience.edu/mike (2014).

33. Casey, A. R. A tale of tidal tails in the Milky Way. Preprint at
http://arXiv.org/abs/1405.5968 (2014).

34. Barklem, P. S. Non-LTE Balmer line formation in late-type spectra: effects of
atomic processes involving hydrogen atoms. Astron. Astrophys. 466, 327-337
(2007).

35. Norris, J. E. et al. The most metal-poor stars. I. Discovery, data, and
atmospheric parameters. Astrophys. J. 762, 25 (2013).

36. Asplund, M., Grevesse, N., Sauval, A. J. & Scott, P. The chemical composition of
the Sun. Annu. Rev. Astron. Astrophys. 47, 481-522 (2009).

37. Lind, K., Asplund, M. & Barklem, P. S. Departures from LTE for neutral Li in
late-type stars. Astron. Astrophys. 503, 541-544 (2009).

38. Lind, K., Asplund, M., Barklem, P. S. & Belyaev, A. K. Non-LTE calculations for
neutral Na in late-type stars using improved atomic data. Astron. Astrophys.
528, A103-A112 (2011).

39. Jacobson, H. R. et al. High-resolution spectroscopic study of extremely
metal-poor star candidates from the SkyMapper survey. Astrophys. J. 807,
171 (2015).

40. Lucatello, S. et al. The frequency of carbon-enhanced metal-poor stars in the
Galaxy from the HERES sample. Astrophys. J. 652, L37-L40 (2006).

41. Schlegel, D. J., Finkbeiner, D. P. & Davis, M. Maps of dust infrared emission for
use in estimation of reddening and cosmic microwave background radiation
foregrounds. Astrophys. J. 500, 525-553 (1998).

42. Casagrande, L., Portinari, L. & Flynn, C. Accurate fundamental parameters
for lower main-sequence stars. Mon. Not. R. Astron. Soc. 373, 13-44
(2006).

43. Poleski, R. et al. An asymmetric streaming motion in the Galactic bulge
X-shaped structure revealed by OGLE-III proper motions. Astrophys. J. 776,
76 (2013).

44. Robin, A. C., Reylé, C., Derriére, S. & Picaud, S. A synthetic view on structure
and evolution of the Milky Way. Astron. Astrophys. 409, 523-540 (2003).

45. Bovy, J. galpy http://github.com/jobovy/galpy (2015).

46. Bovy, J. galpy: A python library for galactic dynamics. Astrophys. J. Suppl.
Ser. 216, 29 (2015).




[Plot: normalised flux versus wavelength (nm), 430.0-431.4 nm]


Extended Data Figure 1 | The C-H band of SMSS J181609.62−333218.7. The C-H band is used to derive an upper limit for C in our most metal-poor
star, SMSS J181609.62−333218.7. Synthetic spectra with abundances of [C/Fe] = 0.06 (blue) and [C/Fe] = 0.56 (red) are shown for comparison.




Extended Data Table 1 | Coordinates and 2MASS photometry of the 23 stars observed 


Name (SMSS) RA (°) Dec. (°) l (°) b (°) J (mag) H (mag) Ks (mag)
J173823.38-145701.1 264.597 -14.950 11.1 8.7 10.85 10.22 10.03
J182048.26-273329.2 275.201 -27.558 5.0 -6.1 12.94 12.42 12.25
J183744.90-280831.1 279.437 -28.142 6.2 -9.7 12.29 11.69 11.33
J183647.89-274333.1 279.200 -27.726 6.5 -9.3 10.68 10.03 9.77
J183812.72-270746.3 279.553 -27.130 7.1 -9.3 13.38 12.79 12.61
J183719.09-262725.0 279.330 -26.457 7.7 -8.9 12.79 12.19 12.03
J184201.19-302159.6 280.505 -30.367 4.5 -11.5 14.52 14.08 14.00
J184656.07-292351.5 281.734 -29.398 5.9 -12.0 13.12 12.61 12.51
J181406.68-313106.1 273.528 -31.518 0.8 -6.6 12.12 11.56 13.30
J181317.69-343801.9 273.324 -34.634 357.9 -7.9 13.09 12.55 12.50
J181219.68-343726.4 273.082 -34.624 357.9 -7.7 12.80 12.28 12.15
J181609.62-333218.7 274.040 -33.539 359.2 -7.9 13.39 12.84 12.71
J181634.60-340342.5 274.144 -34.062 358.8 -8.3 12.56 11.99 11.90
J175544.54-392700.9 268.936 -39.450 352.0 -7.1 13.71 13.19 13.09
J175455.52-380339.3 268.731 -38.061 353.1 -6.3 11.98 11.39 11.26
J175746.58-384750.0 269.444 -38.797 352.8 -7.2 13.09 12.60 12.51
J181736.59-391303.3 274.402 -39.218 354.2 -10.8 12.06 11.54 11.37
J181505.16-385514.9 273.772 -38.921 354.2 -10.2 13.63 13.15 13.09
J181921.64-381429.0 274.840 -38.241 355.2 -10.6 13.64 13.13 13.03
J175722.68-411731.8 269.345 -41.292 350.5 -8.3 13.85 Tuco i ae | 
J175021.86-414627.1 267.591 -41.774 349.4 -7.4 11.74 11.23 11.17
J175636.59-403545.9 269.152 -40.596 351.1 -7.9 12.86 12.29 12.19
J175433.19-411048.9 268.638 -41.180 350.4 -7.8 11.94 11.43 11.33


RA, right ascension; Dec., declination; l and b, Galactic longitude and latitude, respectively.




Extended Data Table 2 | Stellar parameters of the 23 stars observed 


Name (SMSS) vhelio (km s⁻¹) d⊙ (kpc) Teff (K) log(g) (cgs) [Fe/H] (dex) ξ (km s⁻¹) [α/Fe] (dex)
J173823.38-145701.1 46.1 8.5 4599 0.99 -3.36 2.30 0.12
J182048.26-273329.2 51.5 6.0 4949 2.22 -3.48 1.90 0.37
J183744.90-280831.1 -132.6 17.6 4597 0.98 -2.92 2.05 0.33
J183647.89-274333.1 -381.4 6.6 4649 1.17 -2.48 2.50 0.30
J183812.72-270746.3 155.3 12.3 4873 1.74 -3.22 1.81 -0.01
J183719.09-262725.0 -244.7 10.0 4791 1.64 -3.18 1.81 0.32
J184201.19-302159.6 171.8 9.6 5136 2.55 -2.84 1.96 0.30
J184656.07-292351.5 91.0 9.5 4857 1.93 -2.76 1.83 0.34
J181406.68-313106.1 4.9 9.3 4821 1.48 -2.82 1.96 0.22
J181317.69-343801.9 139.3 6.5 5015 2.25 -2.28 1.48 0.41
J181219.68-343726.4 -386.2 8.0 4873 1.94 -2.50 1.93 0.32
J181609.62-333218.7 27.4 10.4 4809 1.93 -3.94 1.60 0.14
J181634.60-340342.5 -170.3 10.5 4821 1.61 -2.46 1.79 0.06
J175544.54-392700.9 -279.6 13.5 4857 1.83 -2.65 1.60 0.32
J175455.52-380339.3 23.5 13.5 4714 1.10 -3.36 1.80 0.08
J175746.58-384750.0 -59.4 9.1 5064 1.96 -2.81 2.36 0.29
J181736.59-391303.3 -177.9 15.7 4612 1.05 -2.59 2.09 0.32
J181505.16-385514.9 202.1 5.0 4962 2.73 -3.29 2.10 0.35
J181921.64-381429.0 -97.7 11.2 4917 2.02 -2.72 1.94 0.30
J175722.68-411731.8 63.8 12.4 4894 1.97 -2.88 2.02 0.19
J175021.86-414627.1 181.4 4.1 5015 2.12 -2.60 1.55 0.30
J175636.59-403545.9 -28.8 9.8 4934 1.79 -3.21 1.96 0.20
J175433.19-411048.9 -229.3 5.6 4912 1.91 -3.26 1.94 0.35


Symbols: vhelio, heliocentric velocity; d⊙, distance from the Sun to the star; Teff, effective temperature; log(g), stellar surface gravity; ξ, microturbulence; [α/Fe] = ([Mg/Fe] + [Ca/Fe] + [Ti/Fe])/3.
Average uncertainties: velocity, 1.0 km s⁻¹; distance, 3.0 kpc; temperature, 160 K; microturbulence, 0.2 dex; log(g), 0.14 dex; [Fe/H], 0.09 dex; [α/Fe], 0.13 dex.




Extended Data Table 3 | Chemical abundances measured for each star, C to Ca 


Name (SMSS) A(Li) [C/Fe] [Na/Fe] [Mg/Fe] [Al/Fe] [Si/Fe] [K/Fe] [Ca/Fe]
J173823.38-145701.1 0.49 0.04 0.17 -0.78 0.27 0.12
J182048.26-273329.2 0.98 -0.28 0.54 -0.63 0.96 0.30 
J183744.90-280831.1 0.16 -0.20 -0.28 0.44 -0.52 0.58 0.36 0.25 
J183647.89-274333.1 -0.47 -0.24 0.33 -0.66 0.51 0.18 
J183812.72-270746.3 0.93 0.22 -0.39 0.05 -1.23 0.14 0.03 
J183719.09-262725.0 0.40 -0.19 0.47 -0.77 0.36 0.41 0.25 
J184201.19-302159.6 0.34 -0.38 0.26 -0.89 0.38 0.53 0.37 
J184656.07-292351.5 1.04 0.08 -0.30 0.41 -0.95 0.36 0.58 0.28 
J181406.68-313106.1 -0.51 0.18 0.23 -0.94 0.32 0.16 
J181317.69-343801.9 1.05 0.17 -0.33 0.53 -0.82 0.33 0.63 0.34 
J181219.68-343726.4 1.01 0.19 -0.22 0.30 -0.86 0.25 0.31 
J181609.62-333218.7 <0.06 -0.01<0.91° 0.20 -1.08 0.54 0.00 
J181634.60-340342.5 -0.10 -0.53 0.05 -1.08 0.11 0.21 0.03 
J175544.54-392700.9 0.87 0.12 -0.32 0.29 -0.88 0.34 0.36 0.29 
J175455.52-380339.3 -0.64 0.06 -0.88 0.30 0.03 
J175746.58-384750.0 -0.04 0.37 -1.10 0.44 0.24 
J181736.59-391303.3 -0.28 -0.11 0.38 -0.69 0.53 0.53 0.26 
J181505.16-385514.9 0.23 -0.23 0.21 -0.96 0.15 0.36 
J181921.64-381429.0 1.04 0.32 -0.24 0.28 -0.82 0.54 0.44 0.26 
J175722.68-411731.8 0.42 -0.42 0.21 -0.70 0.48 0.14 0.13 
J175021.86-414627.1 0.98 0.23 -0.37 0.30 -0.82 0.42 0.28 
J175636.59-403545.9 0.93 0.65 0.30 -0.76 0.45 0.11 
J175433.19-411048.9 0.92 0.24 -0.03 0.40 -0.74 0.43 0.44 0.32 


A(Li) is the logarithmic abundance of lithium. All abundances are derived using LTE, except for Li, Na, Mg, and Ca, where non-LTE corrections have been applied. Average uncertainties: Li, 0.20; C, 0.25; 
Na, 0.20; Mg, 0.16; Al, 0.22; Si, 0.21; K, 0.17; Ca, 0.12. 
0.01 is the lower limit, and 0.91 is the upper limit; see Methods for details. 




Extended Data Table 4 | Chemical abundances measured for each star, Sc to Cu 


Name (SMSS) [Sc/Fe] [Ti/Fe] [Cr/Fe] [Mn/Fe] [Co/Fe] [Ni/Fe] [Cu/Fe] 
J173823.38-145701.1 -0.09 -0.22 -0.80 G.1F -0.21 <0.96 
J182048.26-273329.2 0.16 -0.51 -0.97 0.24 -0.33 <1.33 
J183744.90-280831.1 0.04 0.20 -0.23 -0.38 0.36 0.14 <0.29 
J183647.89-274333.1 0.14 0.34 -0.27 -0.35 0.01 0.02 -0.43 
J183812.72-270746.3 -0.20 -0.51 -0.32 0.22 -0.08 <1.06 
J183719.09-262725.0 0.18 0.14 -0.33 -0.34 0.23 0.23 <1.10 
J184201.19-302159.6 -0.03 0.20 -0.24 -0.57 0.35 -0.02 <0. ¢2 
J184656.07-292351.5 0.11 0.28 -0.19 -0.31 0.11 0.07 <0.45 
J181406.68-313106.1 0.08 0.19 -0.30 -0.60 0.22 0.09 <0.50 
J181317.69-343801.9 0.11 0.34 -0.19 -0.08 0.12 0.10 <1.13
J181219.68-343726.4 0.18 0.31 -0.15 -0.14 0.22 0.22 <0.17
J181609.62-333218.7 0.13 -0.65 -1.28 0.13 -0.11 <1.57
J181634.60-340342.5 -0.26 0.04 -0.24 -0.35 -0.20 -0.03 <-0.05 
J175544.54-392700.9 -0.05 0.32 -0.22 -0.18 0.24 -0.05 <0.32 
J175455.52-380339.3 0.02 -0.47 -1.05 0.07 -0.12 <0.85 
J175746.58-384750.0 0.12 0.19 -0.44 -0.59 0.09 -0.01 <0.79 
J181736.59-391303.3 0.15 0.24 -0.31 -0.34 0.00 0.08 <0.10 
J181505.16-385514.9 0.40 0.38 -0.54 -0.81 0.17 -0.11 <12 
J181921.64-381429.0 0.01 0.32 -0.17 -0.44 0.33 0.25 <0.50 
J175722.68-411731.8 -0.22 0.15 -0.31 -0.48 -0.04 0.04 <0.82 
J175021.86-414627.1 -0.23 0.26 -0.27 -0.31 0.16 0.16 <0.27 
J175636.59-403545.9 -0.28 0.06 -0.29 -0.72 0.17 -0.26 <0.95 


J175433.19-411048.9 0.22 -0.52 -0.91 0.29 0.02 <1.08 


All abundances are derived using LTE. Average uncertainties: Sc, 0.10; Ti, 0.10; Cr, 0.21; Mn, 0.25; Co, 0.23; Ni, 0.19; Cu, 0.25. 




Extended Data Table 5 | Chemical abundances measured for each star, Zn to Eu 
Name (SMSS) [Zn/Fe] [Sr/Fe] [Y/Fe] [Zr/Fe] [Ba/Fe] [La/Fe] [Eu/Fe] 
J173823.38-145701.1 0.66 0.03 0.02 0.23 -0.04 -0.10 


J182048.26-273329.2 <1.01 -0.47 0.03 <1.33 
J183744.90-280831.1 0.27 -0.29 -0.32 0.03 -0.31 <0.12 
J183647.89-274333.1 0.23 0.18 -0.20 0.45 0.13 0.17 0.82
J183812.72-270746.3 <0.79 -1.03 -0.70 <0.77 
J183719.09-262725.0 0.48 0.04 0.53 -0.51 <1.03 
J184201.19-302159.6 <1.15 -0.20 0.04 0.70 0.16 <0.94 
J184656.07-292351.5 0.48 -0.26 -0.51 0.14 -0.32 <0.36 
J181406.68-313106.1 0.42 -1.61 -0.72 <0.32 
J181317.69-343801.9 0.17 Gly 0.11 0.34 0.22 -0.09 0.15 
J181219.68-343726.4 0.33 -0.06 -0.11 0.09 0.13. <0.90 0.48 
J181609.62-333218.7 <1.40 -0.85 0.23 <-0.66 <1.09 0.91 
J181634.60-340342.5 0.21 -0.25 -0.65 -0.21 -0.32 -0.14 -0.11 
J175544.54-392700.9 0.36 -0.10 -0.2 0.38 -0.11 -0.15 0.21 
J175455.52-380339.3 0.63 0.47 0.01 0.14 -0.57 <0.66 
J175746.58-384750.0 -0.21 0.04 0.91 C23 -<1,20 0.65 
J181736.59-391303.3 0.23 -0.14 -0.47 0.14 -0.28 <1.19 0.21
J181505.16-385514.9 <0.95 -0.19 0.14 0.71 0.04 <0.54 0.96 
J181921.64-381429.0 0.42 -0.21 -0.14 0.51 -0.01 0.48 0.59 
J175722.68-411731.8 <0.95 -0.30 -0.30 0.24 -0.19 <0.63 0.52 
J175021.86-414627.1 0.41 -0.14 -0.40 0.25 -0.08 <0.60 0.23
J175636.59-403545.9 0.86 0.55 0.24 ue al -0.95 <1.21 
J175433.19-411048.9 0.50 -0.81 -0.29 -0.42 <0.91 


All abundances are derived using LTE. Average uncertainties: Zn, 0.10; Sr, 0.20; Y, 0.12; Zr, 0.12; Ba, 0.17; La, 0.15; Eu, 0.16. 




Extended Data Table 6 | Orbital parameters 


Name (SMSS) μα cos δ μδ Mean rperi Mean rap Mean eccentricity Mean zmax Etot
(mas yr⁻¹) (mas yr⁻¹) (kpc) (kpc) (kpc) (10⁴ km² s⁻²)

J182048.26-273329.2 -4.10+0.52 -6.38+0.51 0.5 33 ras Bae | Ons. aos Bile ele 
J184201.19-302159.6 -0.38+0.90 -0.82+0.90 127) 66772 72s Ae de uy 
J184656.07-292351.5 1.174089 -2.32+0.89 1.2752 4g 70 065707, 822555 Si a a 
J181406.68-313106.1 2.28+0.52 -8.25 40.52 1.1 +33 sa, O63 2718 -9.5 +38 
J181219.68-343726.4 -2.42 41.14 -1.2941.14 07 +°8 he arene oot,  FT2aes; 3474 
J181609.62-333218.7 -4.14 +0.64 -3.74+0.64 107433 34S O57. Lots oo Bee 
J181634.60-340342.5 1.92 +0.62 -0.31 40.62 1.9 +?° ca ae OCs oe ty ee 
J175544.54-392700.9 0.03 +1.49 -0.35 41.46 1.7 +73 eae BLT yee soe: case 
J175455.52-380339.3  1.9841.14 4.764114 55748 is oe |6ele 66a 
J175746.58-384750.0 1.86+1.25 0.1741.25 1.8 +19 Sa ey (eo - Sa a A ace 


Symbols: μα cos δ and μδ are the proper motions in equatorial coordinates; rperi and rap are the pericentric and apocentric radii of the orbit, respectively; zmax is the maximum distance the orbit reaches
above/below the Galactic plane; and Etot is the total energy of the orbit. All values given here are the mean values from the Monte Carlo simulation of 1,000 orbits.




LETTER 


doi:10.1038/nature15731 


Ubiquitous time variability of integrated stellar populations


Charlie Conroy1, Pieter G. van Dokkum2 & Jieun Choi1


Long-period variable stars arise in the final stages of the 
asymptotic giant branch phase of stellar evolution. They have 
periods of up to about 1,000 days and amplitudes that can 
exceed a factor of three in the I-band flux. These stars pulsate 
predominantly in their fundamental mode’, which is a function 
of mass and radius, and so the pulsation periods are sensitive to 
the age of the underlying stellar population*. The overall number 
of long-period variables in a population is directly related to their 
lifetimes, which is difficult to predict from first principles because 
of uncertainties associated with stellar mass-loss and convective 
mixing. The time variability of these stars has not previously 
been taken into account when modelling the spectral energy 
distributions of galaxies. Here we construct time-dependent stellar 
population models that include the effects of long-period variable 
stars, and report the ubiquitous detection of this expected ‘pixel 
shimmer’ in the massive metal-rich galaxy M87. The pixel light 
curves display a variety of behaviours. The observed variation of 
0.1 to 1 per cent is very well matched to the predictions of our 
models. The data provide a strong constraint on the properties of 
variable stars in an old and metal-rich stellar population, and we 
infer that the lifetime of long-period variables in M87 is shorter 
by approximately 30 per cent compared to predictions from the 
latest stellar evolution models. 

In typical massive galaxies with ~10¹¹ stars, the variation in the
total light due to long-period variables will be small, as the summed
light curves of many such stars effectively cancel each other out (with
random phases, the net effect scales as N^(−1/2), where N is the number
of stars). If the light is spread out over many (for example, ~10*-10°) 
pixels, then the number of stars per pixel can range from ~10* to 
10’ and in this regime the number of asymptotic giant branch stars 
per pixel is small and governed by Poisson statistics. This ‘semi- 
resolved’ regime is well known’, and the expected surface brightness 
fluctuations due to Poisson statistics of rare luminous stars have 
been observed and studied in several hundred nearby galaxies**. 
We expect in this regime to be able to detect the presence of variable 
stars through the time dependence of the pixel flux (that is, the pixel 
light curve): essentially, every pixel is expected to ‘shimmer’ on time- 
scales of several hundred days. 
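
The inverse-square-root scaling for randomly phased variables can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the paper: every star has the same mean flux and the same sinusoidal amplitude, and only a single epoch is sampled per trial. The function name and default values are hypothetical.

```python
import math
import random


def fractional_rms(n_stars: int, amplitude: float = 0.5,
                   n_trials: int = 400, seed: int = 1) -> float:
    """Scatter of the summed flux, relative to the mean, for n_stars
    sinusoidal variables observed at one epoch with random phases."""
    rng = random.Random(seed)
    deviations = []
    for _ in range(n_trials):
        total = sum(amplitude * math.sin(rng.uniform(0.0, 2.0 * math.pi))
                    for _ in range(n_stars))
        deviations.append(total / n_stars)
    mean = sum(deviations) / n_trials
    return math.sqrt(sum((d - mean) ** 2 for d in deviations) / n_trials)


# Increasing the number of stars 16-fold should shrink the scatter about 4-fold.
small, large = fractional_rms(100), fractional_rms(1600)
print(small / large)
```

For identical sinusoids the expected scatter is amplitude / sqrt(2N), so a pixel containing only a handful of variables can shimmer at the per cent level while the galaxy as a whole stays essentially constant.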

In order to quantify the expected pixel shimmer, we created a stellar 
population model at solar metallicity that included the time-dependent 
effect of long-period variables. We started with a new library of 
stellar isochrones (J.C. et al., submitted) that densely samples fast 
phases of stellar evolution, and assigned periods to evolved stars 
assuming that they pulsate in the fundamental mode‘. We then used 
observations of variable stars in the Galactic bulge from the OGLE 
survey to estimate a period-amplitude relation in the I band?". A 
smooth surface brightness model of the giant elliptical galaxy M87 
was used to specify the luminosity within pixels of size 0.2” x 0.2”. 
The pixel luminosity was used to normalize the weights in the iso- 
chrone assuming a Salpeter initial mass function!’. For each pixel 
the number of giants was drawn from a Poisson distribution, and the 


time evolution of the flux for each giant was given by its associated 
period and amplitude and initialized with a random phase. An illus- 
tration of this time-dependent model for M87 is shown in Fig. 1. The 
variable-star part of our model has a tunable parameter, the long- 
period variable star weight, which can be interpreted as the typical 
lifetime of such stars. Further details regarding the modelling are 
provided in Methods. 
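
The per-pixel part of this model (a Poisson draw for the number of variables, then a randomly phased periodic signal for each) can be sketched as below. This is a toy version only: the uniform period range, the single fixed amplitude and the per-star flux fraction are hypothetical stand-ins for the isochrone-based weights and the OGLE period-amplitude relation used in the paper, and all function and parameter names are illustrative.

```python
import math
import random


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw from a Poisson distribution (Knuth's method; fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def pixel_light_curve(times, mean_n_lpv, lpv_flux_fraction, rng,
                      period_range=(100.0, 1000.0), amplitude=0.5):
    """Relative flux of one pixel at the given times (days).

    mean_n_lpv        : expected number of variables in the pixel
    lpv_flux_fraction : hypothetical share of the pixel flux per variable
    """
    n_lpv = poisson_draw(mean_n_lpv, rng)
    # Each variable gets a period (days) and a random initial phase.
    stars = [(rng.uniform(*period_range), rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_lpv)]
    curve = []
    for t in times:
        variable_part = sum(
            lpv_flux_fraction * amplitude * math.sin(2.0 * math.pi * t / period + phase)
            for period, phase in stars)
        curve.append(1.0 + variable_part)
    return n_lpv, curve


rng = random.Random(0)
n_lpv, curve = pixel_light_curve(range(0, 80, 10), 0.5, 0.01, rng)
print(n_lpv, [round(f, 4) for f in curve])
```

Scaling `mean_n_lpv` up or down plays the role of the tunable long-period variable weight mentioned in the text: more variables per pixel on average means more pixels that shimmer.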

We sampled the model with the same cadence and applied the same 
photon counting uncertainties as used for existing observations of 
M87 (see below). The resulting model pixel light curves are shown 
in Fig. 2 (blue lines). These pixels were selected to have peak-to-peak 
flux variation >1.5%. While rising and falling curves are clearly seen, 
one also sees that a <100-d observing window can by chance sample 
a light curve at a phase that appears relatively flat. A >200-d observing 
cadence would clearly be ideal for observing the effects of long-period 
variables in the integrated light of nearby galaxies. 

To test these expected variations, we analysed archival data of the 
galaxy M87 from the Hubble Space Telescope (HST) collected over 
72 d in 2005'*-'’. Imaging was obtained in both the F606W and 
F814W filters with the Advanced Camera for Surveys. We focused our 
analysis on the F814W imaging as the F606W data were generally of 
lower quality (owing both to a shorter exposure time and the fact that 
only a single exposure was obtained per visit, which made it difficult to 
clean the images of blemishes such as hot pixels and cosmic rays). The 
data were processed via the standard HST pipeline. In total 52 separate 
images, each with a depth of 1,440 s, were considered in this analysis. 
Globular clusters in the field were used to refine the astrometric align- 
ment with subpixel shifts. Accurate subtraction of the background 
was achieved with several additional corrections to the standard HST 
pipeline, as detailed in Methods. Pixels that deviated by more than 


Od 20d 40d 60d 80d 100d 120d 


Figure 1 | Illustration of pixel shimmer. Model prediction of the effect 
of long-period variables on integrated light. a, A smooth model for the 
surface brightness profile of M87. b, The flux at time t= 0 divided by 
the mean flux over 1,000 d within each pixel. c, Zoom-in on the lower 
left corner (boxed in b), showing snapshots at 20-d intervals. Notice the 
coherent variation in brightness of individual pixels. 


1Department of Astronomy, Harvard University, Cambridge, Massachusetts, USA. 2Astronomy Department, Yale University, New Haven, Connecticut, USA. 


488 | NATURE | VOL 527 | 26 NOVEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


1 


-2 


Flux variation (%) 
oO 


b&b 4 o 3 w» 
UU LE 


0 
-1 
-2 
2 
1 
0 
-1 
-2 
0 50 1 00 1 50 
Time (d) 


Figure 2 | Simulation of pixel light curves. a–e, Each panel shows 
modelled relative flux variation over 200 d for a randomly chosen pixel 
selected to have a peak-to-peak flux variation of >1.5%. The underlying 
stellar population is old and metal-rich, and includes long-period 
variable stars. In each panel, the noise-free model (dashed blue line) is 
compared to a simulation of the M87 data, including the photon counting 
noise (1σ errors) and cadence of the observations (filled black circles and 
error bars), and a boxcar average of the simulated data (red lines and 1σ 
error bars). Also shown in each panel is N_LPV, the number of long-period 
variable stars per pixel with periods >150 d. 


30% from a smooth model of the light profile were masked. This effec- 
tively removed all visible globular clusters, background galaxies, the 
chip gap, and edge effects. We also masked the central region and the 
well-known jet in M87. The data were binned 4 × 4 to 0.2″ × 0.2″ in 
order to reduce the spatial coherency imposed by the point-spread 
function (PSF; note that the models were also spatially binned and 
have been convolved by the PSF in order to emulate the observations 
as closely as possible). 
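
The masking and binning steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used for the paper: the function name mask_and_bin, the rule that a binned pixel is kept only if all of its constituent pixels are unmasked, and the toy 8 × 8 image are all hypothetical.

```python
import numpy as np

def mask_and_bin(image, smooth_model, max_dev=0.30, bin_size=4):
    """Mask pixels deviating from a smooth model, then bin bin_size x bin_size."""
    image = np.asarray(image, dtype=float)
    smooth_model = np.asarray(smooth_model, dtype=float)

    # Fractional deviation of each pixel from the smooth light profile.
    frac_dev = np.abs(image - smooth_model) / smooth_model
    good = frac_dev <= max_dev

    # Trim so that both dimensions are multiples of the bin size.
    ny = (image.shape[0] // bin_size) * bin_size
    nx = (image.shape[1] // bin_size) * bin_size
    shape = (ny // bin_size, bin_size, nx // bin_size, bin_size)
    img_blocks = image[:ny, :nx].reshape(shape)
    good_blocks = good[:ny, :nx].reshape(shape)

    # Sum the flux in each block. Assumption: a binned pixel is usable
    # only if none of its constituent pixels was masked.
    binned = img_blocks.sum(axis=(1, 3))
    binned_good = good_blocks.all(axis=(1, 3))
    return binned, binned_good

# Toy example: a flat 8 x 8 field with one pixel 50% above the model.
smooth = np.full((8, 8), 100.0)
image = smooth.copy()
image[0, 0] = 150.0
binned, binned_good = mask_and_bin(image, smooth)
```

In this toy case the binned image has shape 2 × 2 and only the block containing the deviant pixel is flagged as unusable.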

Example pixel light curves for M87 are shown in Fig. 3. The error 
bars represent photon counting uncertainties only; the solid line is a 
5-point boxcar averaging of the data. We detect coherent variation in 
the pixel light curves that is qualitatively consistent with our model 
expectations. These examples were chosen to highlight the level of 
variety seen in the data. We note that a significant fraction of pixel 
light curves show no evidence of variation within the noise limits of 


the data. We show below that this is expected if long-period variables 
are the source of the variation. 
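
For concreteness, a 5-point boxcar average of a light curve might be computed as in the sketch below. It assumes an unweighted running mean and independent errors on each point; the text does not specify either choice, and the function names are illustrative.

```python
import numpy as np

def boxcar_average(flux, width=5):
    """Unweighted running mean; edges with incomplete windows are dropped."""
    flux = np.asarray(flux, dtype=float)
    kernel = np.ones(width) / width
    return np.convolve(flux, kernel, mode="valid")

def boxcar_error(sigma, width=5):
    """Error on the running mean, assuming independent per-point errors."""
    var = np.asarray(sigma, dtype=float) ** 2
    summed_var = np.convolve(var, np.ones(width), mode="valid")
    return np.sqrt(summed_var) / width

flux = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
sigma = np.full(7, 0.5)
smoothed = boxcar_average(flux)
smoothed_err = boxcar_error(sigma)
```

With equal errors the averaged error bar shrinks by a factor of √5 relative to a single point, which is why the boxcar curves reveal coherent trends that individual points cannot.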

We have quantified the pixel light curves by fitting each curve with 
a linear function; the best-fit slope and uncertainty were recorded. 
The resulting distribution of slopes (in units of the uncertainty) is 
shown in Fig. 4. We find that 24% of pixels (48,100 out of 202,000) 
show >2σ evidence for variation. In our model there are on average 
1.5 long-period variable stars responsible for variation in each >2σ 
detection. This implies a statistical detection of ~72,000 variable 
stars in M87. When averaged over the central 1′ × 1′ field of view, 
the model predicts on average 0.5 variable stars per pixel. In Fig. 4 
we compare the observations to the model predictions. We show 
the sensitivity of the model light curve statistics to both the stellar 
population age and variable star parameters, and also the posterior 
probability distributions that result from fitting the model to the 
observed histogram when allowing the age and relative variable star 
weight to vary (we do not include the tails of the distribution in the 
fit as the data are slightly asymmetric beyond |slope/error| ≈ 7). The 
variable star weight is an overall factor controlling the contribution 
of variable stars to the integrated light relative to the predictions of 
a stellar evolution model (see Methods for details). The pixel light 
curves provide a strong constraint on a combination of the age and 
variable star weight. The dashed line in Fig. 4d shows the best-fit 
age estimated from modelling the integrated light spectrum of the 
central region of M87¹⁶, which allows us to break the degeneracy 
between age and variable star weight. It is noteworthy that the best-fit 
long-period variable star weight is less than one, suggesting that such 
stars in M87 may have shorter lifetimes than current solar-metallicity 
stellar evolution models predict (see Methods for a discussion of the 
effects of metallicity). 
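
The slope statistic and the star count quoted above can be illustrated with a short sketch. It assumes a standard weighted least-squares straight-line fit with per-point photon-noise errors; the exact fitting procedure is not given in the text, so the function below is an illustration rather than the authors' code.

```python
import numpy as np

def fit_slope(t, flux, sigma):
    """Weighted least-squares line fit; returns (slope, slope_error)."""
    t, flux, sigma = (np.asarray(a, dtype=float) for a in (t, flux, sigma))
    w = 1.0 / sigma**2
    S, Sx, Sy = w.sum(), (w * t).sum(), (w * flux).sum()
    Sxx, Sxy = (w * t * t).sum(), (w * t * flux).sum()
    delta = S * Sxx - Sx**2
    slope = (S * Sxy - Sx * Sy) / delta
    slope_err = np.sqrt(S / delta)
    return slope, slope_err

# Toy light curve: an exact line with slope 2 and unit errors.
t = np.arange(5.0)
slope, slope_err = fit_slope(t, 2.0 * t + 1.0, np.ones(5))
significance = slope / slope_err  # the "slope/error" statistic

# Arithmetic behind the numbers quoted in the text.
detected_fraction = 48_100 / 202_000   # about 0.24
implied_variables = 1.5 * 48_100       # about 72,000
```

A pixel counts as a >2σ detection when the absolute value of `significance` exceeds 2.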

This is not the first detection of time variability in the pixel 
fluxes of nearby galaxies; previous work predicted the occurrence 
of a gravitational microlensing signal at the pixel level¹⁷, which was 
subsequently observed¹⁸. Novae have also been observed in nearby 
galaxies¹⁹, and indeed we identified ~15 novae through visual 
inspection of the pixel light curves for M87. However, novae and 
microlensing events are rare (though bright). An important 
distinguishing feature of the time variation caused by long-period 
variable stars is its ubiquity: as Fig. 4 shows, 24% of the pixels show 
>2σ evidence for variation. 

There are relatively few constraints on the stellar evolutionary 
phase that gives rise to long-period variables. The best constraints to 
date on this phase are confined to the Magellanic Clouds, which have 
sub-solar metallicities characteristic of low-mass galaxies. The obser- 
vations reported here have provided a direct constraint on this impor- 
tant stellar evolutionary phase in a massive, high-metallicity galaxy. 
New stellar evolution models over-predict the lifetimes of long-period 
variables by approximately 30% if a spectroscopic age for M87 of 
10 Gyr is adopted. An older mean population age would reduce the 
mild tension between the models and observations. Constraints such 
as these on highly evolved, luminous stars are essential for interpret- 
ing light from more distant, massive and metal-rich galaxies across 
the Universe. 

The detection of time variation in the integrated light of nearby 
galaxies opens the way to deriving stellar population ages in these 
systems by a completely different approach from conventional tech- 
niques. In the future, one could imagine high cadence observations 
of nearby galaxies on >100-d baselines being performed to detect the 
period distribution of long-period variables by analysis of the power 
spectra of the time series data. This technique is not limited to old 
stellar systems; on the contrary, younger systems would show con- 
siderably greater temporal variation. For example, on the basis of our 
models, we expect that 4%, 14%, and 22% of pixels with 10° stars would 
show >1% absolute flux changes over 100 d for ages of 10¹⁰, 10°, and 
10° yr, respectively. The larger effect at younger ages is due primarily 
to the larger fractional contribution of long-period variables to the 


total light (see Methods for details). It would therefore be relatively 
straightforward to perform similar studies on nearby spiral galaxies, 
where the signal would be much stronger. At a basic and fundamental 
level, each pixel of an observed galaxy varies measurably in time, and 
this variation encodes unique information on its underlying stellar 
population. 


[Figure 3, panels a-f. Axes: Flux variation (%) versus Time (d).]

Figure 3 | Observed pixel light curves for M87. a–f, Each panel shows 
observed relative flux variation over 72 d for a different pixel in M87 
(filled black circles). These pixels were selected to highlight the variety 
of morphology of the light curves, including rising, falling, periodic, and 
peculiar curves. We unambiguously detect the ‘pixel shimmer’ due to the 
contribution of long-period variables to the integrated light. Errors represent 
1σ photon counting uncertainties. Red lines and error bars are 5-point 
boxcar averages of the data. 


[Figure 4, panels a-d. Axes: a-c, Normalized number versus Slope/error; d, log[LPV weight] versus log[age (yr)], with the best-fit point marked.]

Figure 4 | Statistics of the pixel light curves. a–c, Normalized 
distributions of the best-fit linear slope of the pixel light curve in 
units of the 1σ uncertainty on the slope (slope/error). The data (black 
line in a) are compared to several models, including a variable-star-free 
model (labelled ‘noise’ in a), and models varying the age (b), variable 
star amplitude (c), and weight (c). Models with varying age and variable 
star weight (‘LPV weight’) were fitted to the observed histogram, and the 
1σ and 2σ confidence limits on these parameters are shown in d (black 
and red lines, respectively). The best-fit model is shown as a red line in 
a–c and a black cross in d. The vertical dashed line in d indicates the 
best-fit age from fitting the integrated light spectrum. In c is shown the 
effect of doubling the amplitudes of the long-period variables (A × 2; 
blue line) and the weights (W × 2; green line). 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 27 February; accepted 17 September 2015. 
Published online 16 November 2015. 


1. Fox, M. W. & Wood, P. R. Theoretical growth rates, periods, and pulsation 
constants for long-period variables. Astrophys. J. 259, 198-212 (1982). 

2. Wood, P.R. & Sebo, K. M. On the pulsation mode of Mira variables: evidence 
from the Large Magellanic Cloud. Mon. Not. R. Astron. Soc. 282, 958-964 (1996). 

3. Ita, Y. et al. Variable stars in the Magellanic Clouds — II. The data and infrared 
properties. Mon. Not. R. Astron. Soc. 353, 705-712 (2004). 

4. Vassiliadis, E. & Wood, P. R. Evolution of low- and intermediate-mass stars to 
the end of the asymptotic giant branch with mass loss. Astrophys. J. 413, 
641-657 (1993). 

5. Tonry, J. & Schneider, D. P. A new technique for measuring extragalactic 
distances. Astron. J. 96, 807-815 (1988). 

6. Tonry, J. L. et al. The SBF survey of galaxy distances. IV. SBF magnitudes, 
colors, and distances. Astrophys. J. 546, 681-693 (2001). 

7. Blakeslee, J. P. et al. The ACS Fornax cluster survey. V. Measurement and 
recalibration of surface brightness fluctuations and a precise value of the 
Fornax-Virgo relative distance. Astrophys. J. 694, 556-572 (2009). 

8. van Dokkum, P. G. & Conroy, C. Fluctuation spectroscopy: a new probe of old 
stellar populations. Astrophys. J. 797, 56 (2014). 

9. Groenewegen, M. A. T. & Blommaert, J. A. D. L. Mira variables in the OGLE 
bulge fields. Astron. Astrophys. 443, 143-156 (2005). 

10. Soszyński, I. et al. The Optical Gravitational Lensing Experiment. The OGLE-III 
catalog of variable stars. IV. Long-period variables in the Large Magellanic 
Cloud. Acta Astronom. 59, 239-253 (2009). 

11. Soszyński, I. et al. The Optical Gravitational Lensing Experiment. The OGLE-III 
catalog of variable stars. XV. Long-period variables in the Galactic bulge. 
Acta Astronom. 63, 21-36 (2013). 




12. Salpeter, E. E. The luminosity function and stellar evolution. Astrophys. J. 121, 
161-167 (1955). 

13. Waters, C. Z., Zepf, S. E., Lauer, T. R. & Baltz, E. A. Color bimodality in M87 
globular clusters. Astrophys. J. 693, 463-471 (2009). 

14. Peng, E. et al. The color-magnitude relation for metal-poor globular clusters in 
M87: confirmation from deep HST/ACS imaging. Astrophys. J. 703, 42-51 
(2009). 

15. Bird, S. et al. The inner halo of M 87: a first direct view of the red-giant 
population. Astron. Astrophys. 524, A71 (2010). 

16. Conroy, C. & van Dokkum, P. G. The stellar initial mass function in early-type 
galaxies from absorption line spectroscopy. II. Results. Astrophys. J. 760, 
71 (2012). 

17. Gould, A. Search for intracluster MACHOs by pixel lensing of M87. Astrophys. J. 
455, 44-49 (1995). 

18. Baltz, E. A. et al. Microlensing candidates in M87 and the Virgo Cluster with the 
Hubble Space Telescope. Astrophys. J. 610, 691-706 (2004). 

19. Ferrarese, L., Côté, P. & Jordán, A. Hubble Space Telescope observations of 
novae in M49. Astrophys. J. 599, 1302-1319 (2003). 


Acknowledgements We thank M. Groenewegen for discussions. C.C. thanks 
B. Holden and C. Rockosi for asking the question that provided the spark for 
this paper: ‘can one detect Mira variables in integrated light?’ 


Author Contributions C.C. constructed the models, led the data processing, 
and contributed to the analysis and interpretation. P.G.v.D. contributed to the 
analysis and interpretation. J.C. generated the stellar evolution models and 
contributed to the analysis and interpretation. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to C.C. 
(cconroy@cfa.harvard.edu). 




METHODS 


Data reduction and tests. Owing to the small expected amplitude of the time- 
dependent flux signal for M87, great care was taken to control systematic effects. 
In this section we describe the details of the data reduction procedure and the 
additional corrections that were applied to the images. 

We began with the publicly available HST images, in which the four dithered 
exposures per visit were combined and astrometrically aligned, resampled by the 
drizzling process, and cosmic rays were removed. The public images include flat 
field corrections and a standard sky subtraction. We used five globular clusters to 
refine the astrometric alignment via subpixel shifts (using bilinear interpolation). 
The mean shift was 0.25 pixels (in both x and y directions) in the unbinned images. 
All of our analysis was performed on images binned 4 x 4, so these shifts are a tiny 
fraction of the final pixel size. 

Owing to the large angular extent of M87, the standard ACS pipeline is not able 
to accurately measure the true sky background. We therefore applied a correction 
to the sky subtraction. We assumed that the true M87 surface brightness profile is 
that reported by Kormendy²⁰, which was derived by combining a variety of space 
and ground-based data. Using this profile, we estimated the sky background in our 
ACS images by minimizing the residuals between the ACS data and Kormendy’s 
profile with the sky background, normalization, and a linear colour gradient as free 
parameters (the last is to account for differences between our F814W filter and 
Kormendy’s V-band profile). This was done separately for each of the 52 images. 
We refer to this as the primary sky background correction. 
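
A simplified version of this sky fit can be written as a linear least-squares problem. The sketch below fits only two of the three free parameters (the normalization and a constant sky level) and omits the linear colour gradient, so it should be read as an illustration of the idea rather than the procedure actually used; the function name and toy numbers are assumptions.

```python
import numpy as np

def fit_sky(data, profile):
    """Fit data = norm * profile + sky by linear least squares."""
    data = np.asarray(data, dtype=float)
    profile = np.asarray(profile, dtype=float)
    # Design matrix: one column for the reference profile, one for a constant sky.
    A = np.column_stack([profile, np.ones_like(profile)])
    (norm, sky), *_ = np.linalg.lstsq(A, data, rcond=None)
    return norm, sky

# Toy example: reference profile scaled by 2 plus a sky level of 0.3.
profile = np.array([10.0, 5.0, 2.0, 1.0])
data = 2.0 * profile + 0.3
norm, sky = fit_sky(data, profile)
```

In practice one such fit would be run for each of the 52 images, with the recovered sky level subtracted from that image.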

In order to test the fidelity of the images over the 72 d, we selected three back- 
ground galaxies and measured their fluxes within an 8-pixel aperture. These 
galaxies should show no detectable temporal variation. The resulting temporal 
variation of the total flux from these galaxies is shown in Extended Data Fig. 1a. 
There are no obvious time-dependent trends. However, the scatter is 0.5%, which is 
relatively large compared to the signal of interest (of the order of 1%). We therefore 
made several additional modifications to the sky background levels in an effort 
to reduce the scatter. 

We identified three additional background galaxies (that is, not the ones used 
to measure the flux variation in Extended Data Fig. 1) and measured their flux 
variation over the duration of the observations. Under the assumption that these 
sources should have no intrinsic flux variation, we determined a sky background 
correction necessary to bring the flux of the background galaxies to a constant. The 
average correction determined this way was 0.002 counts s⁻¹. At this point in the 
analysis, the distribution of pixel light curve slopes showed a slight preference for 
positive slopes (the mean slope/error was +0.5). Under the assumption that the 
true distribution should have a mean of zero, we subtracted a linearly varying sky 
background component (which scaled as 5 x 10-°¢). These two corrections yield 
a distribution of pixel light curve slopes with zero mean (by construction), and a 
temporal flux variation in the three reference background galaxies with a scatter 
of 0.2% as shown in Extended Data Fig. 1b. Moreover, a 5-point boxcar average 
of the light curve of the background galaxies shows flux variation at the ≲0.1% 
level. From this test we conclude that it should be possible to measure intrinsic flux 
variation at the sub-percent level, at least for pixels where photon counting noise 
is not the dominant source of uncertainty. 

We emphasize that the additional sky background corrections discussed above 
do not materially change our conclusions. While these corrections result in a shift 
in the histogram of slope/error values, they have no effect on the width of the distribution. 
Moreover, the example pixel light curves shown in Fig. 3 are unchanged 
within their 1σ error bars. 

Approximately half of the exposures were obtained at a detector location offset 
by 60-70 pixels compared to the other half of the exposures. This provides a further 
test that the trends shown in Fig. 3 and the statistics in Fig. 4 are not dominated 
by unknown systematics at the level of the detector; if they were, one would have 
expected to see flux variation that correlated with the dither pattern, but such 
correlations, if present, are within the noise limits of the data. 

As a final test of both the data reduction and our results, in Extended Data 
Fig. 2 we show histograms of the flux variation over the 72-d observing window. 
The flux variation was computed by temporally binning the exposures by five to 
reduce the Poisson noise in the measurement and computing (maximum − minimum)/mean 
flux at each pixel. The results are shown for three bins of pixel fluxes 
(the legends show the cuts in units of counts per second and the total number 
of pixels per bin). The data (black lines) are compared to our best-fit model as 
derived from fitting the pixel light curve slopes (red lines) and a model without 
long-period variables (blue lines). The good agreement between the model and 
data in all three panels is a strong indication that our measurements are reliable, 
as the panels probe a factor of 40 in dynamic range in pixel fluxes. Systematic 
issues with, for example, the sky subtraction would show up most strongly in 
the pixels with low count rates, and yet the observations and models agree very 


well in that regime. Moreover, this flux variation metric is model independent 
and so the difference between the variable-star-free model and the observations 
provides further strong support that the variation detected in the observations 
is real and not an artefact of some unknown systematics. There do exist subtle 
differences between the model and data that vary as a function of the pixel flux, 
but this could be due to changes in the underlying stellar populations as the pixels 
with low fluxes are in the outskirts, where the ages and metallicities of the stars 
are expected to differ from the central regions. 
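
The flux variation metric used in this test is simple enough to state in code. The sketch below assumes that exposures are grouped into consecutive sets of five, that any trailing exposures that do not fill a group are dropped (52 exposures would give 10 groups), and that the mean is taken over the binned values; none of these details is specified in the text.

```python
import numpy as np

def flux_variation(light_curve, n_bin=5):
    """(maximum - minimum)/mean of a light curve after temporal binning."""
    lc = np.asarray(light_curve, dtype=float)
    n_use = (lc.size // n_bin) * n_bin          # drop an incomplete final group
    binned = lc[:n_use].reshape(-1, n_bin).mean(axis=1)
    return (binned.max() - binned.min()) / binned.mean()

# Toy light curve: five exposures at 99 counts followed by five at 101.
toy_curve = np.array([99.0] * 5 + [101.0] * 5)
variation = flux_variation(toy_curve)   # 2/100, that is 2%
```

Because the metric involves no model fitting, comparing its observed distribution with the noise-only prediction is a direct check on the reality of the signal.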

Modelling long-period variables. Here we provide additional details regarding the 
incorporation of long-period variables in the stellar population synthesis model- 
ling. We start with stellar isochrones that include all relevant evolutionary phases, 
including thermally pulsating asymptotic giant branch (AGB) stars. We include 
a model for circumstellar dust around these stars, which results in dimmer stars 
especially for the most intrinsically luminous and evolved stars²¹. Periods (in days) 
are assigned according to the following equation: 


logP = −2.07 + 1.94log(R/R⊙) − 0.9log(M/M⊙) (1) 


which assumes that the stars pulsate in their fundamental mode’. Next, we require 
a relation between pulsation period and amplitude. This relation is shown in 
Extended Data Fig. 3 for stars in the Galactic bulge from OGLE data¹¹. Symbols 
are colour-coded according to the type of pulsator. The dashed lines are the adopted 
period-amplitude (P-A) sequences: 


logA = 0.5logP − 1.25 (2) 
for the Mira sequence (logP > 2.2), and: 
logA = 2logP − 5 (3) 


for the semi-regular variable (SRV) sequence (1.0 < logP < 2.2). In the equations 
above, P is in days and the amplitude A is in the I band in magnitudes. We note 
that the SRV sequence is included for completeness but has a very small effect on 
the model predictions. 
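
As a worked example of the adopted period-amplitude sequences, the following sketch evaluates equations (2) and (3). The function name is illustrative, and returning None outside the stated period ranges is our own choice, since the text does not define an amplitude there.

```python
def log_amplitude(log_p):
    """log10 of the I-band amplitude (mag) for a given log10 period (days)."""
    if log_p > 2.2:           # Mira sequence, equation (2)
        return 0.5 * log_p - 1.25
    if 1.0 < log_p < 2.2:     # semi-regular variable sequence, equation (3)
        return 2.0 * log_p - 5.0
    return None               # outside the ranges given in the text

mira_log_a = log_amplitude(2.5)   # P of about 316 d gives logA = 0, A = 1 mag
srv_log_a = log_amplitude(2.0)    # P = 100 d gives logA = -1, A = 0.1 mag
```

The tenfold difference in amplitude between these two examples shows why the Mira sequence dominates the predicted signal.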

The equations above, along with the initial mass function weights determined 
by the masses of the AGB stars in the isochrones, completely specify our default 
variable-star model. In order to convert fluxes to luminosities, we have assumed 
a distance to M87 of 16.7 Mpc (ref. 7). In order to explore the constraining power 
of the data, we considered variation in both the amplitude of the long-period var- 
iables, implemented as an overall scaling of all the amplitudes by the same factor, 
and the weight given to the variable stars in the population synthesis. The latter 
can be interpreted as a change to the typical lifetime. 

We have taken great care to ensure that the long-period variable phase is well- 
resolved in the isochrone tables. The isochrones were constructed from 185 indi- 
vidual mass models and with 600 equivalent evolutionary points in the thermally 
pulsating AGB phase alone. We have run a variety of tests to ensure that our 
model predictions are ‘converged’; for example, we have created models with fewer 
evolutionary points and fewer input mass models and the resulting predictions 
are very similar. For context, at 10 Gyr our isochrones contain 350 points on 
the AGB with periods >200 d, while the publicly available Padova²² isochrones 
contain only 3 such points. 

Extended Data Figure 4 quantifies the fractional contribution of long-period 
variables to the total flux of a stellar population as a function of wavelength, 
age, and metallicity ([Z/H]). The flux contribution peaks in the age range of 
10°°-10° yr and increases towards redder bands. The trend with wavelength is a 
reflection of the fact that variable stars are cool and so emit most of their light in 
the near-infrared. We caution that the wavelength-dependence shown here does 
not directly translate into the wavelength-dependence of the time-dependent 
signal because the period-amplitude relation also depends on wavelength. As 
the pulsation directly affects the radius and hence the temperature, for these cool 
stars one expects and indeed observes that the amplitudes are larger in the bluer 
wavebands²³. The metallicity-dependence is relatively modest, at least over the 
range [Z/H] = —0.3 to [Z/H] = +0.3, typical of massive galaxies. It is difficult to 
provide a simple explanation of the model metallicity variation, as it depends not 
only on the variable-star lifetime, luminosity, and temperature, but also on the 
properties of the underlying stellar population. 

We do not expect metallicity to play a critical role in the interpretation of the 
observations for several reasons. First, as noted in the previous paragraph, the 
models suggest a relatively weak metallicity-dependence of the long-period variable 
flux contribution. Second, M87 harbours a metallicity gradient²⁴, extending 
from slightly super-solar in the inner R_e/8 to slightly sub-solar at R_e, where R_e is 
the effective radius. Despite this metallicity gradient, our best-fit model provides 
an equally good fit to the pixel shimmer statistics in both the central region and 
the outskirts, as shown in Extended Data Fig. 2. 




We note here that individual long-period variables have been detected in 
nearby galaxies including the Magellanic Clouds²⁵, M31²⁶, and M32²⁷. The most 
distant galaxy with secure detections of individual long-period variable stars is 
NGC 5128²⁸, and in this case the observations were confined to the outskirts where 
the stellar density was sufficiently low to permit the separation of the brightest 
evolved stars from the background sea of lower luminosity stars. These observa- 
tions of individual long-period variables should provide very useful constraints 
on the modelling of such stars, and we intend to make use of these constraints in 
future work. 
Trends with radius. The HST field of view covers the central 3.3′ × 3.3′ of M87, 
of which the inner ~1′ × 1′ has a signal-to-noise ratio S/N ≳ 100 per pixel for the 
observations that were analysed herein. Kormendy²⁰ reports an effective radius 
of R_e = 3.2′ so the region of the images with high S/N covers the inner ~0.3R_e. 
Extended Data Figure 5 shows several important quantities as a function of R/R_e 
for our best-fit model of M87. Extended Data Figure 5a shows the stellar mass per 
pixel for the underlying smooth stellar distribution. Extended Data Figure 5b 
shows the fraction of pixels with |slope/error| > 2. In the main text we reported 
that 24% of pixels meet this criterion, and in fact that percentage remains approximately 
constant with radius. The constancy is the result of two opposing effects: 
at larger radius the number of stars per pixel is lower, which implies a larger 
variable-star signal. The effect on the slope scales approximately as √N (here N 
is the number of stars per pixel) as multiple variable stars with random phases 
will cancel each other out in a central-limit-theorem-like process. However, at 
larger radius the S/N is lower, and for a fixed exposure time this also scales as 
√N. Thus, for a fixed exposure time, the detectability of long-period variables is 
fairly constant with radius. 
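
The central-limit behaviour invoked here can be checked with a small Monte Carlo experiment. The sketch below is a toy model, not the population synthesis used in the paper: it assumes identical unit-amplitude sinusoidal pulsators with uniformly random phases, and the function name, trial count and seed are arbitrary. In this toy model the summed signal grows only as √N while the total flux grows as N, so the fractional variation falls as 1/√N.

```python
import numpy as np

def rms_summed_signal(n_stars, n_trials=4000, seed=0):
    """RMS of the summed signal of n_stars unit-amplitude random-phase pulsators."""
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_trials, n_stars))
    total = np.sin(phases).sum(axis=1)   # one summed signal per trial pixel
    return total.std()

rms_100 = rms_summed_signal(100)
rms_400 = rms_summed_signal(400)
ratio = rms_400 / rms_100            # close to sqrt(400/100) = 2

# Fractional variation relative to a total flux proportional to N.
frac_100 = rms_100 / 100
frac_400 = rms_400 / 400
```

Quadrupling the number of stars roughly doubles the absolute signal but halves the fractional one, which is the trade-off against photon noise described in the text.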

Extended Data Figure 5c and d show the model trends with radius for a noise- 
free model (infinite S/N). In this case it is clear that the absolute effect of long- 
period variables is larger at larger radius. Extended Data Figure 5c shows the 
fraction of pixels with >1% peak-to-peak flux variation over 200 d. Old stellar 
populations with a pixel mass <10°M,, yield >1% flux variation in ~10% of the 
pixels. Extended Data Figure 5d compares the surface brightness fluctuation (SBF) 
amplitude at a single epoch to the mean temporal variation over a 200-d baseline; 


the latter is smaller than the former by a factor of ~5. The SBF amplitude is com- 
puted as the standard deviation of the model flux divided by a smooth model for 
the flux. 
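
One reading of this definition is sketched below: the pixel fluxes are divided by the smooth model and the standard deviation of the ratio is taken. Whether the division precedes or follows the standard deviation is not stated in the text, so this ordering is an assumption, as are the function name and the toy values.

```python
import numpy as np

def sbf_amplitude(model_flux, smooth_flux):
    """Standard deviation of the pixel flux relative to a smooth model."""
    ratio = np.asarray(model_flux, dtype=float) / np.asarray(smooth_flux, dtype=float)
    return ratio.std()

# Toy example: pixels fluctuating by 2% about a smooth level of 100.
pixels = np.array([98.0, 102.0, 98.0, 102.0])
amplitude = sbf_amplitude(pixels, np.full(4, 100.0))   # 0.02
```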

We close by noting that while the overall effect of long-period variables on the 
integrated light is relatively modest at old stellar ages, it is much more prominent 
for younger stellar populations, for example, in the 10°-10° yr range. Future work 
devoted to younger stellar populations will therefore probably uncover a rich array 
of observational signatures of time variable stellar populations. 

Code availability. We have opted not to make the code used in this manuscript 
available because the data reduction and analysis is fairly straightforward and can 
be easily reproduced following the methods described herein. 


20. Kormendy, J., Fisher, D. B., Cornell, M. E. & Bender, R. Structure and formation 
of elliptical and spheroidal galaxies. Astrophys. J. Suppl. Ser. 182, 216-309 (2009). 

21. Villaume, A., Conroy, C. & Johnson, B. Circumstellar dust around AGB stars and 
implications for infrared emission from galaxies. Astrophys. J. 806, 82 (2015). 

22. Marigo, P. et al. Evolution of asymptotic giant branch stars. II. Optical to 
far-infrared isochrones with improved TP-AGB models. Astron. Astrophys. 482, 
883-905 (2008). 

23. Smith, B. J., Leisawitz, D., Castelaz, M. W. & Luttermoser, D. Infrared light 
curves of Mira variable stars from COBE DIRBE data. Astron. J. 123, 948-964 
(2002). 

24. Kuntschner, H. et al. The SAURON project — XVII. Stellar population analysis of 
the absorption line strength maps of 48 early-type galaxies. Mon. Not. R. 
Astron. Soc. 408, 97-132 (2010). 

25. Wood, P. R., Bessell, M. S. & Fox, M. W. Long-period variables in the Magellanic 
Clouds — supergiants, AGB stars, supernova precursors, planetary nebula 
precursors, and enrichment of the interstellar medium. Astrophys. J. 272, 
99-115 (1983). 

26. Fliri, J., Riffeser, A., Seitz, S. & Bender, R. The Wendelstein Calar Alto Pixel- 
lensing Project (WeCAPP): the M 31 variable star catalogue. Astron. Astrophys. 
445, 423-439 (2006). 

27. Davidge, T. J. & Rigaut, F. Photometric variability among the brightest 
asymptotic giant branch stars near the center of M32. Astrophys. J. 607, 
L25-L28 (2004). 

28. Rejkuba, M., Minniti, D. & Silva, D. R. Long period variables in NGC 5128. 
I. Catalogue. Astron. Astrophys. 406, 75-85 (2003). 


[Extended Data Figure 1, panels a and b. Horizontal axis: time (d).]

Extended Data Figure 1 | Flux of background galaxies. Shown is the 
time variation of the flux of three background galaxies. The background 
galaxies should show no intrinsic time variation in their flux and therefore 
serve as a test of the stability of the data. The mean (μ) and standard 
deviation (σ) are reported in each panel. The 1σ error on each point due to 
photon counting uncertainty is 0.09%. The solid line is a 5-point boxcar 
average. a, Flux variation after the standard data reduction including the 
primary sky background correction. The arrow indicates a point that lies 
at −2.1. b, Flux variation after additional corrections were applied to the 
sky background levels. These additional corrections allow us to achieve a 
stability of ~0.1% for boxcar-averaged time series data. 


[Extended Data Figure 2, three panels: 5 < ct/s < 10 (N_pix = 74,790); 10 < ct/s < 20 (N_pix = 60,604); 20 < ct/s < 200 (N_pix = 53,183). Horizontal axis: flux variation (%).]

Extended Data Figure 2 | Flux variation distributions. Shown is the 
normalized distribution of (maximum − minimum)/mean fluxes over the 
72-d observing window, separated into three bins of counts per second (ct/s). 
The data (black lines) are compared to the best-fit model (red lines) and 
a noise-only model (blue dotted lines). Also shown in each panel is the 
number of pixels, N_pix, contributing to the distribution. 


[Extended Data Figure 3. Legend: Miras, SRVs, OSARGs. Horizontal axis: log period (d).]

Extended Data Figure 3 | Amplitude versus period for luminous 
variable stars. Data are for Galactic bulge stars from the OGLE survey¹¹ 
measured in the I band. The distinct classes of Miras, semi-regular 
variables (SRVs), and OGLE small-amplitude red giants (OSARGs) are 
shown as red, green, and blue symbols. Lines are the adopted sequences for 
Miras and SRVs; these relations are used to assign pulsation amplitudes in 
our model. 


[Extended Data Figure 4, panels a and b. Legend: [Z/H] = −0.3, [Z/H] = +0.0, [Z/H] = +0.3. Horizontal axis: log age (yr).]

Extended Data Figure 4 | Long-period variable (LPV) star flux 
contribution versus age, wavelength, and metallicity. a, Fractional 
contribution to the total luminosity versus age in four bandpasses: 
I (0.8 μm), z (0.9 μm), J (1.2 μm), and K (2.4 μm). The flux contribution 
scales approximately as t~/?. b, Flux contribution versus age and 
metallicity for the I and K bandpasses. The metallicity range shown 
encompasses the observed variation in M87 within R_e. 


[Extended Data Figure 5, panels a-d: mass per pixel (model for M87); fraction with |slope/error| > 2 (noise included); fraction with >1% flux variation (noise-free); flux variation (%) (noise-free). Horizontal axis: R/R_e.]

Extended Data Figure 5 | Radial variation of model properties for M87. 
a, Stellar mass per pixel for the smooth underlying model for M87 as a 
function of R/R_e where R_e is the effective radius. b, Fraction of pixels with 
|slope/error| > 2. c, Fraction of pixels in a noise-free model with >1% 
peak-to-peak flux variation over 200 d. d, Strength of surface brightness 
fluctuation (SBF) signal at a single epoch compared to the mean temporal 
variation due to variable stars in a noise-free model. 




LETTER 


doi:10.1038/nature16137 


Collisionless encounters and the origin of the 


lunar inclination 


Kaveh Pahlevan¹ & Alessandro Morbidelli¹ 


The Moon is generally thought to have formed from the debris 
ejected by the impact of a planet-sized object with the proto-Earth 
towards the end of planetary accretion’. Models of the impact 
process predict that the lunar material was disaggregated into a 
circumplanetary disk and that lunar accretion subsequently placed 
the Moon in a near-equatorial orbit³⁻⁶. Forward integration of the
lunar orbit from this initial state predicts a modern inclination 
at least an order of magnitude smaller than the lunar value—a 
long-standing discrepancy known as the lunar inclination 
problem⁷⁻⁹. Here we show that the modern lunar orbit provides a
sensitive record of gravitational interactions with Earth-crossing 
planetesimals that were not yet accreted at the time of the Moon- 
forming event. The currently observed lunar orbit can naturally 
be reproduced via interaction with a small quantity of mass 
(corresponding to 0.0075-0.015 Earth masses eventually accreted 
to the Earth) carried by a few bodies, consistent with the constraints 
and models of late accretion¹⁰,¹¹. Although the encounter process
has a stochastic element, the observed value of the lunar inclination 
is among the most likely outcomes for a wide range of parameters. 
The excitation of the lunar orbit is most readily reproduced via 
collisionless encounters of planetesimals with the Earth-Moon 
system with strong dissipation of tidal energy on the early Earth. 
This mechanism obviates the need for previously proposed (but 
idealized) excitation mechanisms¹²,¹³, places the Moon-forming
event in the context of the formation of Earth, and constrains the 
pristineness of the dynamical state of the Earth-Moon system. 

The Moon-forming impact is thought to have generated a compact 
circumplanetary disk (within ten Earth radii, R_E) of debris out of which
the Moon rapidly accreted. Like Saturn's rings, the proto-lunar disk 
would be expected to become equatorial on a timescale that is rapid 
relative to its evolutionary timescale. Hence, so long as the proto-lunar 
material disaggregated into a disk following the giant impact, the Moon 
is expected to have accreted within about one degree of the Earth’s 
equatorial plane⁶. Tidal evolution calculations suggest that for every
degree of inclination of the lunar orbital plane relative to the Earth’s
equatorial plane at an Earth-Moon separation of 10R_E, the current
lunar orbit would exhibit about half a degree of inclination relative
to Earth’s orbital plane”. The modern lunar inclination of approximately
5° would, without external influences, translate to an inclination
of about 10° to Earth’s equatorial plane at 10R_E shortly after lunar
accretion. This approximately tenfold difference between theoretical 
expectations of lunar accretion and the observed Earth-Moon system 
is known as the lunar inclination problem. 
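
As a quick check of the arithmetic behind this discrepancy, the short Python sketch below applies the figures quoted in this paragraph: about half a degree of modern inclination per degree of equatorial inclination at ten Earth radii, and accretion within about one degree of the equator. It is an illustration only, not part of the original analysis, and the constant and function names are hypothetical.

```python
# Values quoted in the text; none are derived here.
MODERN_INCLINATION_DEG = 5.0    # approximate modern inclination to Earth's orbital plane
MAPPING_RATIO = 0.5             # modern degrees per degree at 10 Earth radii
EXPECTED_PRIMORDIAL_DEG = 1.0   # accretion within about one degree of the equator


def required_primordial_inclination(modern_deg, ratio=MAPPING_RATIO):
    """Equatorial inclination at 10 Earth radii implied by a modern inclination."""
    return modern_deg / ratio


required = required_primordial_inclination(MODERN_INCLINATION_DEG)
discrepancy = required / EXPECTED_PRIMORDIAL_DEG
print(f"required at 10 Earth radii: {required:.0f} deg, discrepancy: {discrepancy:.0f}x")
```

The factor of ten printed here is the approximately tenfold mismatch that the text refers to as the lunar inclination problem.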

Previous work on this problem has sought to identify mechanisms 
such as a gravitational resonance between the newly formed Moon and 
the Sun¹² or the remnant proto-lunar disk¹³ that can excite the lunar
inclination to a level consistent with its current value. Neither of these 
scenarios is satisfactory, however, as the former requires particular 
values of the tidal dissipation parameters and the latter has only been 
shown to be viable in an idealized system in which a single, fully formed 
Moon interacts with a single pair of resonances in the proto-lunar disk. 


Moreover, previous works assumed that the excitation of the lunar orbit 
was determined during interactions that essentially coincided with 
lunar origin. Here, we propose that the lunar inclination arose much 
later as a consequence of the sweep-up of remnant planetesimals in the 
inner Solar System. 

After the giant impact and at most 10° years¹⁴,¹⁵, the Moon has
accreted, interacted with¹³ and caused the collapse of the remnant proto-lunar
disk onto the Earth⁶, passed the evection resonance with the
Sun³,¹²,¹⁶ and begun a steady outward tidal evolution. On a timescale
(10⁶–10⁷ years) that is rapid relative to that characterizing depletion
of planetesimals in the final post-Moon-formation stage of planetary
accretion¹⁷ (called ‘late accretion’), the lunar orbit expands, owing to the
action of tides, to an Earth-Moon separation of 20R_E–40R_E. During this
time, the lunar orbit transitions from precession around the spin axis of 
Earth to precession around the normal vector of the heliocentric orbit’, 
and its inclination becomes insensitive to the shifting of the Earth’s 
equatorial plane via subsequent accretion¹⁸. However, as we show, lunar
inclination becomes more sensitive to gravitational interactions with 
passing planetesimals as the tidal evolution of the system proceeds. The 
sensitivity is such that it renders the lunar orbital excitation a natural 
outcome of the sweep-up of the leftovers of accretion and yields a new 
constraint on the dynamical and tidal environment of the Earth-Moon 
system in the 10⁸ years immediately following its origin.

Although subsequent collisions with the Earth-Moon system have 
been previously considered as a mechanism for dynamical excitation'’, 
the collision of inner Solar System bodies with the Earth tends to be 
preceded by a large number (10°-10%) of collisionless encounters. 
Excitation via this process is governed by two relevant timescales: the 
timescale over which remnant populations in the inner Solar System 
are lost via accretion onto the planets and the Sun (several tens of millions
of years¹⁷); and the timescale for the lunar tidal orbital expansion,
which is a rapidly varying function of the Earth-Moon distance.
The Earth-Moon distance is important because it determines the sys- 
tem cross-section for collisionless encounters with remnant bodies. 
Accordingly, the rate of tidal expansion of the lunar orbit during the 
first approximately 10⁸ years after the giant impact is also important.
As tidal evolution proceeds and the Earth-Moon separation increases, 
the system becomes increasingly susceptible to collisionless excitation, 
while populations capable of exciting the system are progressively 
depleted. A few tens of millions of years after the Moon-forming event, 
the Earth-Moon system reaches an optimal capacity for excitation via 
gravitational encounters: a dynamically excitable system co-existing 
with a substantial remnant-body population. 

Here, we run a series of Monte Carlo simulations to set constraints 
on the outcome of repeated encounters of massive bodies with the 
evolving Earth-Moon system. The simulations are carried out until 
the populations are exhausted either through collision with the Earth 
or through non-terrestrial loss channels (for details, see Methods). 
A sample run of dynamical excitation during the first approximately 
10⁸ years of Earth-Moon history is shown in Fig. 1. No single event
dominates: several strong encounters contribute substantially to the 


¹Laboratoire Lagrange, Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Boulevard de l’Observatoire, CS 34229, 06304 Nice Cedex 4, France.


492 | NATURE | VOL 527 | 26 NOVEMBER 2015 




[Figure 1: panels a and b plotted against time (years), with axis ticks at 5 × 10⁷ and 10⁸; a ‘Modern’ value is marked.]


Figure 1 | Sample realization. a, b, A model of the early lunar orbit
subject to tidal evolution (k2/Q = 0.1) and encounters leading to collision
of two 0.00375M_E bodies with the Earth. The semi-major axis of the
evolving lunar orbit a is given in Earth radii (shown in b). Although not
every encounter increases the lunar inclination i, the cumulative effect
shows a tendency towards excitation (shown in a). Notable interactions
include merging collisions with the Earth kicking the lunar orbit via recoil
(at 29.1 Myr and 31.5 Myr), several exceptionally close encounters with the
Moon (at 7.3 Myr and 109.6 Myr) and the exhaustion of the population
(at 141.7 Myr) ultimately marking the end of the simulation. Subsequent
inclination damping owing to planetary tides is modest (from 5.8° at 47R_E
at the end of the simulation to 5.4° at 60R_E), a feature that is typical of this
‘late’ excitation mechanism.


final excitation. The size distribution of the late-accreting population 
is assumed to be top heavy, with most of the mass contained in a few 
massive bodies, as has been previously proposed to explain terrestrial 
late accretion¹⁰,¹¹. This particular simulation ultimately results in two
0.00375M_E planetesimals (where M_E is Earth’s mass) left over from the
formation process colliding with the Earth. Tidal damping of the lunar
inclination is applied along with the lunar orbital expansion, following
equation (1) (see Methods). We do not consider the possibility that the
lunar inclination might have been more strongly damped via dissipation
in the lunar magma ocean, as recently proposed¹⁹. In the Methods,
we show that this effect is not important as long as the lunar magma 
ocean crystallized within a few tens of millions of years. 

Lunar orbital excitation in this epoch depends on the total mass 
of leftover planetesimals, the number of bodies carrying the mass 
and their orbital distribution, the rate of terrestrial tidal dissipation, 
and a stochastic element. In Fig. 2, we show results of simulations of 
the excitation of the lunar inclination due to interaction of the system
with a small amount of mass (equivalent to 0.0075M_E–0.015M_E
eventually accreted to the Earth), for different values of the strength
of tidal dissipation and the number of bodies delivering the mass
(which is constrained to be <20 colliding with Earth via models of
late accretion¹⁰,¹¹).

Several features are apparent. First, there is a quasi-linear dependence 
of the excitation on the total mass of late accretion: other variables 
being equal, excitation corresponding to 0.015M_E of late accretion is
approximately twice as great as that with 0.0075M_E. The mass accreted
onto the Earth thereby provides a proxy for collisionless excitation.
Second, the lunar orbital excitation exhibits some dependence on the
strength of tidal evolution: stronger dissipation within the Earth drives
the lunar orbit outwards faster and exposes the system to more collisionless
events. Simulations with the weakest tides that we considered
(characterized by a ratio of the tidal Love number to the specific dissipation
function of k2/Q = 0.01) typically excite the lunar inclination with
a planetesimal population consistent with 0.015M_E of late accretion,
whereas with stronger tidal dissipation (k2/Q = 0.1), lunar inclination
is routinely excited by a planetesimal population carrying 0.0075M_E of
late accretion. Third, there exists a negative dependence of the excitation
on the number of bodies involved in late accretion, such that the




[Figure 2: lunar inclination (vertical axis, 0 to 25) plotted against strength of tides, k2/Q (0.01 to 0.1); a horizontal line is labelled ‘Lunar’.]


Figure 2 | Summary of simulations. Median values (symbols), and 1σ
(solid lines) and 2σ (dashed lines) intervals for the lunar inclination i at
the end of the simulations after damping via planetary tides to the modern
Earth-Moon separation. The excitation in the modern lunar orbit is plotted
for comparison, indicated by the horizontal line labelled ‘lunar’. Diamonds
correspond to simulations with 0.0075M_E accreted to Earth; squares
correspond to 0.015M_E. ‘Strong’ tidal dissipation (k2/Q = 0.1) corresponds
to a hot dissipative silicate Earth, and ‘weak’ dissipation (k2/Q = 0.01)
represents the geologic average value dominated by dissipation in shallow
oceans. In these simulations, the accretion of 0.0075M_E (with ‘strong’ tides)
to 0.015M_E (with ‘weak’ tides) frequently reproduces the excitation in the
lunar orbit. The number of bodies delivering the late accreted mass in each
set of simulations is reported above each symbol.


mechanism requires a population that is top heavy (with most of the 
mass delivered via the most massive bodies). For a given mass of late 
accretion, a greater number of bodies also renders the distribution of 
lunar inclinations more strongly peaked and predictions of the expected 
excitation more precise. Despite an order of magnitude of uncertainty 
in the strength of early terrestrial tides (k2/Q) and in the number of 
bodies involved in the leftover population, and the stochasticity that is 
inherent in collisionless encounters, close encounters with a population
of planetesimals delivering 0.0075M_E–0.015M_E to the Earth after
the Moon-forming event can robustly reproduce the excitation that 
characterizes the lunar orbit. 

The angular momentum of the Earth-Moon system at the time of 
its origin is a central feature diagnostic of various proposed giant- 
impact scenarios!?*'8, Given that the lunar orbit provides a sensitive 
dynamical measure of encounters with the Earth after the origin of the 
Moon, we ask whether such gravitational interactions were effective in 
injecting or extracting angular momentum. Figure 3 summarizes the 
angular momentum change versus the final excitation. The change in 
angular momentum corresponding to the modern inclination excita- 
tion of approximately 5° is probably a few tens per cent or less. Hence, 
the standard giant-impact scenario⁴ followed by little subsequent
dynamical modification is compatible with the dynamical state of the
modern system, whereas a high-angular-momentum impact scenario³,⁵
would require another dynamical mechanism such as the evection
resonance³,¹²,¹⁶ to be reconciled with the modern Earth-Moon system.

The sensitivity of the orbits of impact-generated satellites to ongoing 
accretion onto the host planet has several consequences. The degree 
of orbital excitation resulting from interaction with, and accretion of, 
0.0075M_E–0.015M_E onto the early Earth suggests that collisionless
encounters with massive bodies—such as the Moon-to-Mars mass 
embryos thought to have played a key role in the accretion of the 
Earth—would have excited satellites to very eccentric orbits ulti- 
mately leading to their dynamical loss, either via collision with the 
host planet or liberation into heliocentric orbit. Such excitability of 
impact-generated satellite orbits may explain several features of the 
inner Solar System that have yet to be understood. For example, despite 
impact-generated satellites being a quasi-generic feature of terrestrial 
planet formation via giant impact, the absence of an impact-generated 






Figure 3 | Angular momentum change of the Earth-Moon system.
Median values (symbols), and 1σ (solid lines) and 2σ (dashed lines)
intervals for inclination i and angular momentum L_z change via post-lunar
collisionless encounters. Diamonds represent realizations with weak tides
(k2/Q = 0.01) and 0.0075M_E accretion; squares correspond to realizations
with stronger dissipation (k2/Q = 0.1) and 0.015M_E accretion, bracketing
the range in our simulations. Each suite of simulations is composed of two
subsets: one with late accretion delivered via one body (greater excitation);
the other, four bodies (lesser excitation). Intermediate outcomes with two
accreted bodies are omitted for clarity. For a level of excitation consistent
with the modern lunar orbit (5.15°), the amount of system angular
momentum change is probably <20%. However, the 2σ intervals for the
strongest excitation case plotted extend to Δi = 42° and |ΔL_z/L_z| = 0.48,
implying a small probability (<5%) for angular momentum change >50%.


satellite around Venus²⁰ and the apparent absence of a pre-Moon terrestrial
satellite²¹ can be understood: any such early-formed satellites
would have been lost via encounters with extant planetary embryos, 
including perhaps the Moon-forming impactor itself. Moreover, the 
occurrence of the Moon-forming giant impact late in the history of 
Earth accretion can be understood as a necessity for its survival: even 
satellites generated moderately earlier would have been readily dynamically
destabilized. Just as the survival of planets depends on the surrounding
stellar environment²², the survival of an impact-generated
satellite depends on the planetary environment at the time of origin. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 23 March; accepted 6 October 2015. 


1. Cameron, A. G. W. & Ward, W. R. The origin of the Moon. Lunar Planet. Sci. Conf. 7, abstr. 120-122 (1976).

2. Hartmann, W. K. & Davis, D. R. Satellite-sized planetesimals and lunar origin. Icarus 24, 504-515 (1975).

3. Ćuk, M. & Stewart, S. T. Making the Moon from a fast-spinning Earth: a giant impact followed by resonant despinning. Science 338, 1047-1052 (2012).

4. Canup, R. M. & Asphaug, E. Origin of the Moon in a giant impact near the end of the Earth’s formation. Nature 412, 708-712 (2001).

5. Canup, R. M. Forming a Moon with an Earth-like composition via a giant impact. Science 338, 1052-1055 (2012).

6. Ida, S., Canup, R. M. & Stewart, G. R. Lunar accretion from an impact-generated disk. Nature 389, 353-357 (1997).

7. Mignard, F. The lunar orbit revisited, III. Moon Planets 24, 189-207 (1981).

8. Goldreich, P. History of lunar orbit. Rev. Geophys. 4, 411-439 (1966).

9. Touma, J. & Wisdom, J. Evolution of the Earth-Moon system. Astron. J. 108, 1943-1961 (1994).

10. Bottke, W. F., Walker, R. J., Day, J. M. D., Nesvorný, D. & Elkins-Tanton, L. Stochastic late accretion to Earth, the Moon, and Mars. Science 330, 1527-1530 (2010).

11. Raymond, S. N., Schlichting, H. E., Hersant, F. & Selsis, F. Dynamical and collisional constraints on a stochastic late veneer on the terrestrial planets. Icarus 226, 671-681 (2013).

12. Touma, J. & Wisdom, J. Resonances in the early evolution of the Earth-Moon system. Astron. J. 115, 1653-1663 (1998).

13. Ward, W. R. & Canup, R. M. Origin of the Moon’s orbital inclination from resonant disk interactions. Nature 403, 741-743 (2000).

14. Thompson, C. & Stevenson, D. J. Gravitational instability in two-phase disks and the origin of the Moon. Astrophys. J. 333, 452-481 (1988).

15. Salmon, J. & Canup, R. M. Lunar accretion from a Roche-interior fluid disk. Astrophys. J. 760, 83 (2012).

16. Wisdom, J. & Tian, Z. L. Early evolution of the Earth-Moon system with a fast-spinning Earth. Icarus 256, 138-146 (2015).

17. Morbidelli, A., Marchi, S., Bottke, W. F. & Kring, D. A. A sawtooth-like timeline for the first billion years of lunar bombardment. Earth Planet. Sci. Lett. 355-356, 144-151 (2012).

18. Canup, R. M. Dynamics of lunar formation. Annu. Rev. Astron. Astrophys. 42, 441-475 (2004).

19. Nimmo, F. & Chen, E. M. A. Tidal dissipation in the early lunar magma ocean and its role in the evolution of the Earth-Moon system. 45th Lunar Planet. Sci. Conf. abstr. 1459 (2014).

20. Alemi, A. & Stevenson, D. J. Why Venus has no moon. Bull. Am. Astron. Soc. 38, 491 (2006).

21. Canup, R. M. Lunar-forming impacts: processes and alternatives. Philos. Trans. R. Soc. London Ser. A 372, 20130175 (2014).

22. Spurzem, R., Giersz, M., Heggie, D. C. & Lin, D. N. C. Dynamics of planetary systems in star clusters. Astrophys. J. 697, 458-482 (2009).


Acknowledgements This research was carried out as part of a Henri Poincaré
Fellowship at the Observatoire de la Côte d’Azur (OCA) to K.P. The Henri
Poincaré Fellowship is funded by the OCA and the City of Nice, France.
A.M. thanks the European Research Council Advanced Grant ACCRETE
(no. 290568).


Author Contributions K.P. and A.M. discussed every step of the project, 
designed the simulation set-up and co-wrote the numerical code. 
K.P. performed the simulations and the statistical analysis. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
K.P. (pahlevan@oca.eu). 




METHODS 


A large number of simulations (about 10°) are required to characterize the distribu- 
tion of outcomes for repeated encounters of a given planetesimal population with 
a given early Earth-Moon system. Accordingly, we design a numerical experiment 
that captures the physics of the problem statistically and that can be computed 
efficiently. Heliocentric orbits for late-accreting planetesimal populations were 
generated according to a Rayleigh distribution with a Rayleigh eccentricity e_R = 0.3
and inclination i_R = e_R/2; these values are consistent with simulations of terrestrial
planet formation²³. To test for sensitivity to population orbits, we varied e_R between
0.3 and 0.4; the resulting median inclination excitation changed by less than 10%. 
With given orbital distributions, the subset of the population that is Earth-crossing 
was selected and encounter probabilities with the Hill radius (R_H) of the Earth were
calculated according to expressions given in ref. 24. The masses of the planetesimals
are assumed to be in the range 0.15M_L–1.2M_L (M_L is the lunar mass), consistent
with those expected for the projectiles carrying the Earth’s late accretion’. At the 
beginning of the simulations, the Earth and the Moon were placed on circular 
uninclined orbits with radii of 1 au and 5R_E, respectively, near their orbits at the
end of accretion and the beginning of tidal history. An encounter time and encoun- 
ter orbit were chosen randomly according to the distribution of Earth-crossing 
planetesimal encounter probabilities. At the time of each encounter, phases for 
the lunar orbit, characterized by the argument of perigee (ω), the longitude of
the ascending node (Ω) and the mean anomaly (M) were selected randomly, as
was the orientation of the planetesimal orbit, within those orientations admitted 
by the selected orbital parameters. The impact parameter (b) was selected in 
the interval [0, R_H] according to a uniform encounter probability per unit area
(dP ∝ b db). Gravitational three-body (Earth-Moon-planetesimal) encounters
were integrated with a Bulirsch–Stoer integrator (included in the SWIFT package)
in a geocentric reference frame, tracking changes to the lunar orbit. In between 
three-body encounters, the eccentricity (e) and semi-major axis (a) of the lunar 
orbit were evolved with a constant-Q tidal model²⁵ while the lunar inclination was
evolved with a model²⁶ for planetary tides:


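$$\frac{1}{i}\frac{\mathrm{d}i}{\mathrm{d}t} = -\frac{1}{4}\,\frac{1}{a}\frac{\mathrm{d}a}{\mathrm{d}t} \qquad (1)$$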
Impacts with the Earth were assumed to be inelastic merging events with the final 
body carrying the total mass and momentum. Impacts onto the Moon would have 
been in the erosive and/or catastrophic disruption regime and realizations with 
such events were removed from the subsequent analysis (discussed below). 
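
The random draws described in this paragraph can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the simulations; the function names are hypothetical and lengths are expressed in units of the Hill radius. A uniform probability per unit area means the cumulative probability of an impact parameter below b is (b/R_H)^2, so b can be drawn as R_H times the square root of a uniform random number.

```python
import math
import random


def sample_impact_parameter(r_max, rng):
    """Draw b in [0, r_max] with dP proportional to b db (uniform per unit area)."""
    # The cumulative distribution is (b / r_max)**2; inverting it gives the sqrt.
    return r_max * math.sqrt(rng.random())


def sample_orbit_phases(rng):
    """Uniform random angles (radians) for the lunar orbit at encounter time."""
    two_pi = 2.0 * math.pi
    return {
        "arg_perigee": rng.uniform(0.0, two_pi),
        "ascending_node": rng.uniform(0.0, two_pi),
        "mean_anomaly": rng.uniform(0.0, two_pi),
    }


rng = random.Random(0)  # fixed seed so the illustration is repeatable
draws = [sample_impact_parameter(1.0, rng) for _ in range(100_000)]
mean_b = sum(draws) / len(draws)                               # expectation is 2/3
inner_half_fraction = sum(b < 0.5 for b in draws) / len(draws)  # expectation is 1/4
phases = sample_orbit_phases(rng)
print(f"mean b = {mean_b:.3f} R_H, fraction inside R_H/2 = {inner_half_fraction:.3f}")
```

Only a quarter of encounters fall within half the Hill radius under this weighting, which is why distant encounters are far more numerous than close ones.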

Remnant planetesimals can, in general, be eliminated via collision with the ter- 
restrial planets and the Sun or dynamical ejection from the inner Solar System". 
We characterize such losses using the outcome of direct N-body simulations that 
trace the evolution of such early planetesimals, yielding a tenfold depletion of the 
Earth-crossing population in the first 100 Myr, which corresponds to a population 
decay law of approximately exp(−t/τ_SS), where t is time and τ_SS is the time constant
for the decay of the population (approximately 44 Myr) (ref. 17). Although
the modern near-Earth-object population is resupplied by the asteroid belt in a 
quasi-steady state fashion and effectively does not decay, the leftover planetesi- 
mal population is not resupplied by a larger reservoir and therefore does decay. 
Owing to partial resupply of Earth-crossing bodies, the timescale for the decay 
of the Earth-crossing population can nevertheless be different to the lifetime of 
individual particles. To integrate the decay rates of Earth-crossing planetesimals 
using our simulations, we use the following procedure. After generating orbital 
populations, but before running three-body integrations, we allow the Earth- 
crossing populations to encounter the Earth alone, which permits derivation of 
a time constant for decay of this population solely via collision with the Earth 
(τ_E = 79 Myr). Next, we require that the Earth-crossing planetesimal population
in our three-body simulations decay at the same average rate as that observed in 
the N-body heliocentric simulations. We therefore decompose average loss rates 
of the Earth-crossing population into terrestrial and non-terrestrial loss modes 
(1/τ_SS = 1/τ_E + 1/τ_NE), and thereby derive a time constant (τ_NE = 99 Myr) for
removal via non-terrestrial loss channels. Accordingly, we stochastically remove 
bodies from the population in our three-body simulations such that the average 
loss rate of Earth-crossing planetesimals, through collision with Earth (explicit)
as well as through other modes of loss (implicit), is consistent with the average
loss rates observed in N-body simulations of late accretion¹⁷ (see Extended
Data Fig. 1a). 
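
The time constants quoted in this paragraph can be cross-checked with a few lines of Python. The sketch below is illustrative only and the function names are hypothetical. It assumes a pure exponential decay and uses the rate decomposition 1/τ_SS = 1/τ_E + 1/τ_NE with the quoted values of 44 Myr and 79 Myr; the share of losses that end as Earth collisions follows from the ratio of the two rates for competing exponential processes.

```python
import math


def efolding_time(depletion_factor, elapsed_myr):
    """Time constant (Myr) of an exponential decay that depletes a population
    by `depletion_factor` over `elapsed_myr`."""
    return elapsed_myr / math.log(depletion_factor)


def non_terrestrial_time_constant(tau_ss, tau_e):
    """Solve 1/tau_ss = 1/tau_e + 1/tau_ne for tau_ne (all in Myr)."""
    return 1.0 / (1.0 / tau_ss - 1.0 / tau_e)


def surviving_fraction(t_myr, tau_ss):
    """Fraction of the Earth-crossing population remaining after t_myr."""
    return math.exp(-t_myr / tau_ss)


TAU_SS = 44.0  # Myr, overall decay constant quoted in the text
TAU_E = 79.0   # Myr, decay via collision with the Earth alone

tau_ne = non_terrestrial_time_constant(TAU_SS, TAU_E)
earth_fraction = TAU_SS / TAU_E  # (1/tau_E) / (1/tau_SS)

print(f"tenfold depletion in 100 Myr implies tau = {efolding_time(10.0, 100.0):.1f} Myr")
print(f"tau_NE = {tau_ne:.1f} Myr")
print(f"fraction of losses ending on Earth = {earth_fraction:.2f}")
print(f"fraction surviving after 100 Myr = {surviving_fraction(100.0, TAU_SS):.3f}")
```

Under these assumptions the non-terrestrial time constant comes out near 99 Myr, in line with the value stated above, and a tenfold depletion over 100 Myr corresponds to a time constant close to the quoted 44 Myr.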

Each data point in Fig. 2 is the result of 4,000 realizations. To analyse the results, 
certain realizations were eliminated. These realizations fall into three categories: 
those that result in (1) a collision of a planetesimal with the Moon; (2) dynamical 
loss of the Moon; and (3) too large or too small a mass accreted by the Earth. 

(1) Occasionally, one of the planetesimals in our simulations collides with the 
Moon rather than with the Earth. For simulations that deliver the late-accreted 




mass to Earth via one, two and four planetesimals, the fraction of realizations 
in which such an event takes place is 9%, 15% and 25%, respectively. Given 
the masses that we assume for the planetesimals (0.15M_L–1.2M_L), it is doubtful
that the Moon ever experienced such a massive collision. The largest lunar
impact for which we have clear and unambiguous evidence is the South Pole 
Aitken Basin event (an approximately 10* erg event’), which, at the encounter 
velocities considered here (see Extended Data Fig. 1b), corresponds to a lunar 
impactor that is 3-4 orders of magnitude less massive than the planetesimals 
in our populations. Although most basin-forming impacts are thought to have 
occurred during the late accretion era’, the effect of these impacts on the lunar 
inclination was minor’. 

(2) Certain realizations, particularly those that correspond with the strongest 
tides and largest amount of interacting mass, generate excitations in eccentricity 
that are sufficient to destabilize the satellite orbit. Hence, collision with the host 
planet or (more rarely) liberation of the satellite into heliocentric orbit follows. 

(3) The total amount of planetesimal mass at the start of simulations was 
chosen such that, on average, the mass accreted by the Earth would be 0.0075M_E
or 0.015M_E, hereafter denoted the ‘target mass’. Given the stochasticity inherent
to this problem, the accreted mass varies between realizations, resulting in a 
distribution of outcomes centred on the target mass. To facilitate the expression 
of the results in terms of late-accreted mass, we eliminate from the subsequent 
analysis those realizations whose accreted mass is greater than or less than the 
target mass. 

Differential momentum transfer is the process underlying this excitation 
mechanism. For simplicity, we describe this process for the case of a planetes- 
imal colliding with the Earth, but the general case of a collisionless encounter is 
similar. The orbit of the Earth and Moon around their common centre of mass 
is defined by their relative position and velocity. A third body encountering the 
Earth-Moon binary must have an orbit that crosses the system’s heliocentric orbit, 
and approaches it with some finite velocity at large separation. Hence, the delivery 
of mass onto the Earth is accompanied by the delivery of external momentum 
that—in the impulse approximation—changes the relative velocity, but not the 
relative position, of the Earth with respect to the Moon, altering the mutual orbit, 
a hitherto overlooked effect that can excite the lunar inclination and eccentricity. 
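
The impulse approximation described here can be illustrated with a small vector calculation. The sketch below is a simplified example, not the three-body integration used in the simulations. It assumes a circular lunar orbit, an inelastic merger with the Earth, and an illustrative impact speed of 15 km/s directed normal to the lunar orbital plane (the geometry that tilts the orbit most). The physical constants are standard values; the function names, the chosen Earth-Moon separation of 30 Earth radii and the impact speed are assumptions made for the example.

```python
import math

GM_EARTH = 3.986e14        # m^3 s^-2, Earth's gravitational parameter
R_EARTH = 6.371e6          # m, mean Earth radius
MOON_MASS_RATIO = 0.0123   # lunar mass / Earth mass


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def inclination_after_impact(a_earth_radii, mass_ratio, impact_speed, direction):
    """Tilt (degrees) of the lunar orbit relative to its original plane after the
    Earth merges with a body of mass mass_ratio * M_Earth arriving at
    impact_speed (m/s) along `direction` (any non-zero 3-vector)."""
    a = a_earth_radii * R_EARTH
    v_circ = math.sqrt(GM_EARTH * (1.0 + MOON_MASS_RATIO) / a)
    r_rel = (a, 0.0, 0.0)        # Moon relative to Earth; unchanged by the impulse
    v_rel = (0.0, v_circ, 0.0)   # circular orbit in the x-y plane
    # Momentum conservation in the merger changes the Earth's velocity by
    # m * v / (M + m); the Moon's velocity is not affected.
    dv = mass_ratio / (1.0 + mass_ratio) * impact_speed
    norm = math.sqrt(sum(c * c for c in direction))
    dv_earth = tuple(dv * c / norm for c in direction)
    v_new = tuple(v - d for v, d in zip(v_rel, dv_earth))
    h = cross(r_rel, v_new)      # specific angular momentum of the mutual orbit
    return math.degrees(math.atan2(math.hypot(h[0], h[1]), h[2]))


tilt = inclination_after_impact(30.0, 0.00375, 15.0e3, (0.0, 0.0, 1.0))
in_plane = inclination_after_impact(30.0, 0.00375, 15.0e3, (0.0, 1.0, 0.0))
print(f"out-of-plane kick: {tilt:.2f} deg, in-plane kick: {in_plane:.2f} deg")
```

With these illustrative numbers a single 0.00375 Earth-mass merger tilts the orbit by roughly two degrees, whereas the same impulse directed within the orbital plane changes the orbit's shape but not its inclination.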

We ask whether the satellite excitation is dominated by the few strongest 
encounters or by the much more numerous distant ones. Theory suggests that 
for top-heavy perturber populations, the few strongest perturbations dominate 
over the more numerous weak ones²⁸. We test this theory for our simulations by
generating realizations where the impact parameter is chosen in the interval 
[0, R_H], [0, R_H/2] and [0, R_H/4], progressively eliminating a large number of distant
encounters. The results of the three simulations are statistically indistinguishable 
(see Extended Data Fig. 2a), confirming the theoretical expectation. To facilitate the 
reproducibility of our results, we plot a measure of the strength of a perturbation 
against the impact parameter of the encounters for two individual simulations 
(Extended Data Fig. 2b, c). 

The simulations are permitted to proceed until the population of Earth-crossing 
bodies is exhausted either through collision with the Earth or through loss via
another channel in accordance with an average rate (see above). At the end of the 
simulations, which characteristically last about 10⁸ years, the lunar orbit is typically
at about 40R_E. To compare the simulation outcomes to the modern system, we
propagate the lunar orbit forward to its current separation at 60R_E and permit the
inclination to decay in accordance with the action of planetary tides (equation (1)). 
The number of simulations was chosen such that the median inclination values 
vary by only several per cent. 
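
The final propagation step can be sketched in closed form. The example below assumes that equation (1) has the form (1/i) di/dt = −(1/4)(1/a) da/dt, which integrates to an inclination proportional to a^(−1/4); that reading of the damping law is an assumption, and the function name is hypothetical. The input values are those quoted in the Fig. 1 legend.

```python
def damped_inclination(i_start_deg, a_start, a_end):
    """Inclination after tidal expansion from a_start to a_end (same units),
    assuming i is proportional to a**(-1/4)."""
    return i_start_deg * (a_start / a_end) ** 0.25


# Values from the Fig. 1 legend: 5.8 degrees at 47 Earth radii,
# propagated to the modern separation of 60 Earth radii.
final_inclination = damped_inclination(5.8, 47.0, 60.0)
print(f"inclination at 60 Earth radii: {final_inclination:.2f} deg")
```

Under this assumption the inclination falls by only about six per cent between 47 and 60 Earth radii, close to the modest damping from 5.8° to 5.4° reported in the Fig. 1 legend.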

Recently, it has been suggested that the lunar inclination could have been 
damped via obliquity tides in the lunar magma ocean (LMO) as the Moon’s 
obliquity increased during its approach to the Cassini state transition between 
20R_E and 30R_E (ref. 19). The authors of ref. 19 put forward one interpretation of
the current excited state of the lunar inclination: that the inclination was excited 
early”, but that the rate of tidal dissipation in the post-giant-impact Earth 
was sufficiently low to delay passage of the lunar orbit through the Cassini state 
transition until after the crystallization of the LMO, at which point the effect 
of obliquity tides on the lunar inclination becomes much less. With the ‘late’ 
mechanism of lunar orbital excitation described here, we identify a different 
solution: that the rate of tidal dissipation on the post-impact Earth is sufficiently 
rapid in the first tens of millions of years to carry the Moon through the Cassini 
state transition and to damp any early acquired lunar inclination. Following 
such a resetting episode, the LMO crystallizes, and inclination excitation due 
to encounters is subsequently preserved. To explore this solution, we ran a suite 
of simulations in which the inclination is reset to zero until a certain time, and 
permitted to accumulate excitation subsequently (see Extended Data Fig. 3). 
Such a transition marks the time of crystallization of the LMO. It can be seen 




that as long as the duration of the LMO crystallization is sufficiently short, such
a solution is viable and, indeed, necessary in a tidal evolution scenario recently
described²⁹.

Code availability. The code used to conduct these simulations is available by 
request from the authors. 


23. Walsh, K. J., Morbidelli, A., Raymond, S. N., O’Brien, D. P. & Mandell, A. M. A low mass for Mars from Jupiter’s early gas-driven migration. Nature 475, 206-209 (2011).

24. Wetherill, G. W. Collisions in the asteroid belt. J. Geophys. Res. 72, 2429-2444 (1967).

25. Yoder, C. F. & Peale, S. J. The tides of Io. Icarus 47, 1-35 (1981).

26. Kaula, W. M. Tidal dissipation by solid friction and the resulting orbital evolution. Rev. Geophys. 2, 661-685 (1964).

27. Wieczorek, M. A., Weiss, B. P. & Stewart, S. T. An impactor origin for lunar magnetic anomalies. Science 335, 1212-1215 (2012).

28. Collins, B. F. & Sari, R. Levy flights of binary orbits due to impulsive encounters. Astron. J. 136, 2552-2562 (2008).

29. Zahnle, K. J., Lupu, R., Dobrovolskis, A. & Sleep, N. H. The tethered Moon. Earth Planet. Sci. Lett. 427, 74-82 (2015).

30. Marchi, S., Bottke, W. F., Kring, D. A. & Morbidelli, A. The onset of the lunar cataclysm as recorded in its ancient crater populations. Earth Planet. Sci. Lett. 325-326, 27-38 (2012).




[Extended Data Figure 1 panels: a, number of bodies versus time (Myr); b, N (arbitrary units) versus approach velocity (km/s).]

Extended Data Figure 1 | Properties of planetesimal populations.
a, Decay rates of Earth-crossing planetesimal populations according
to N-body simulations of the inner Solar System with a resonant (3:2)
Jupiter and Saturn at 5.4 au and 7.2 au, respectively. Different colours
represent the number of Earth-crossing bodies (hence the evolution is
not monotonic) in different simulations, from recent integrations. The
black line is the prescribed decay rate used in the three-body simulations
(timescale of 44 Myr) to match the decay rate in heliocentric simulations.
b, Histogram of implemented approach velocities (before acceleration due
to Earth gravity) for late-accreting populations in three-body simulations.
The population is generated using a Rayleigh distribution of eccentricities
and inclinations (e_R = 0.3, i_R = e_R/2) and a semi-major axis range
(a = 0.8-1.4 au) that produces Earth-crossing orbits. These parameters are
motivated by simulations of terrestrial planet formation, but the peak of
the distribution (9 km s⁻¹) corresponds to the typical encounter velocity
inferred for lunar basin-forming impactors.
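The sampling described in panel b can be illustrated with a short sketch. It is not the authors' code (which is available only on request) and rests on several assumptions: e_R and i_R are taken to be the Rayleigh scale parameters, semi-major axes are drawn uniformly, Earth is on a circular orbit at 1 au, and the approach speed is estimated with the standard Öpik expression U² = 3 − 1/a − 2√(a(1 − e²)) cos i in units of Earth's orbital speed. The function name and its arguments are hypothetical.

```python
import numpy as np

V_EARTH_KM_S = 29.78  # Earth's mean orbital speed; circular orbit at 1 au assumed


def sample_approach_velocities(n, e_r=0.3, a_min=0.8, a_max=1.4, seed=0):
    """Draw n Earth-crossing orbits and return approach speeds in km/s.

    Speeds are 'at infinity', i.e. before acceleration by Earth's gravity.
    """
    rng = np.random.default_rng(seed)
    speeds = []
    while len(speeds) < n:
        a = rng.uniform(a_min, a_max, size=n)        # semi-major axis (au), assumed uniform
        e = rng.rayleigh(scale=e_r, size=n)          # eccentricity, Rayleigh scale e_R
        inc = rng.rayleigh(scale=e_r / 2, size=n)    # inclination (rad), scale i_R = e_R / 2
        # Keep bound orbits that cross 1 au: perihelion < 1 au < aphelion.
        ok = (e < 1) & (a * (1 - e) < 1) & (a * (1 + e) > 1)
        a, e, inc = a[ok], e[ok], inc[ok]
        # Opik encounter speed relative to Earth, in units of Earth's orbital speed.
        u2 = 3 - 1 / a - 2 * np.sqrt(a * (1 - e**2)) * np.cos(inc)
        speeds.extend(V_EARTH_KM_S * np.sqrt(np.clip(u2, 0, None)))
    return np.array(speeds[:n])
```

A histogram of `sample_approach_velocities(10000)` gives a distribution of the same general kind as panel b; the exact shape depends on the assumptions listed above.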




[Extended Data Figure 2 panels: a, horizontal axis, maximum impact parameter; b, c, log(|Δh|/|h|) versus impact parameter (Earth radii).]

Extended Data Figure 2 | Tests and outcomes for reproducibility. a, Test
of the cumulative effect of repeated encounters: median values (squares),
and 1σ (solid lines) and 2σ (dashed lines) intervals for three suites of
simulations with 0.0075M⊕ accreted via a single body onto an Earth with
strong dissipation (k2/Q = 0.1). Each suite of simulations consists of
incoming planetesimals with impact parameters (b) ranging from 0 to R_H,
R_H/2 and R_H/4 (as indicated by the values given above each simulation
result). The statistical similarity of the resultant distributions shows that
distant encounters have a far smaller effect on inclination excitation than
do the rare and strong close encounters. b, c, Distribution of ‘kicks’ versus
impact parameter (b) of encounters from two different realizations.
The change in the angular momentum vector of the satellite (Δh) owing to
encounter torques is normalized to the magnitude of the orbital angular
momentum before the encounter (h). The planetesimals approaching the
Earth-Moon system in this simulation have a mass of 0.0075M⊕. Approach
velocities are selected from the distribution plotted in Extended Data
Fig. 1b. The cumulative effects of encounters with b > 60R⊕ are negligible
and therefore neglected. Panel b (c) shows data from a realization that lasts
26.2 Myr (45.3 Myr) and results in a satellite with a final inclination of
i = 1.9° (i = 8.8°). The strength of tidal dissipation used here (k2/Q = 0.1)
quickly results in a satellite semi-major axis of a = 30R⊕-40R⊕.




[Extended Data Figure 3 panel: horizontal axis, cessation time of damping (Myr).]

Extended Data Figure 3 | Effect of partial damping due to LMO
obliquity tides. Median values (squares), and 1σ (solid lines) and 2σ
(dashed lines) intervals for several suites of partially damped simulations.
These simulations consist of an accreted mass of 0.0075M⊕ delivered
via a single body onto a strongly dissipative (k2/Q = 0.1) Earth, with the
orbital excitation continuously reset (e = 0, i = 0) until a certain time and
permitted to accumulate subsequently. Such a transition represents LMO
crystallization and the cessation of inclination damping via obliquity
tides (ref. 19). These simulations are representative of excitation behaviour for
partially damped cases. It can be seen that, so long as the crystallization of
the LMO is sufficiently rapid (about 10⁷ years), excitation via planetesimal
encounters is largely unaffected.




LETTER 


doi:10.1038/nature15768 


Type-II Weyl semimetals 


Alexey A. Soluyanov¹, Dominik Gresch¹, Zhijun Wang², QuanSheng Wu¹, Matthias Troyer¹, Xi Dai³ & B. Andrei Bernevig²


Fermions—elementary particles such as electrons—are classified as 
Dirac, Majorana or Weyl. Majorana and Weyl fermions had not been 
observed experimentally until the recent discovery of condensed 
matter systems such as topological superconductors and semimetals, 
in which they arise as low-energy excitations¹⁻⁶. Here we propose
the existence of a previously overlooked type of Weyl fermion that
emerges at the boundary between electron and hole pockets in a new
phase of matter. This particle was missed by Weyl⁷ because it breaks
the stringent Lorentz symmetry in high-energy physics. Lorentz 
invariance, however, is not present in condensed matter physics, 
and by generalizing the Dirac equation, we find the new type of 
Weyl fermion. In particular, whereas Weyl semimetals—materials 
hosting Weyl fermions—were previously thought to have standard 
Weyl points with a point-like Fermi surface (which we refer to as 
type-I), we discover a type-II Weyl point, which is still a protected 
crossing, but appears at the contact of electron and hole pockets in 
type-II Weyl semimetals. We predict that WTe2 is an example of
a topological semimetal hosting the new particle as a low-energy 
excitation around such a type-II Weyl point. The existence of type-II 
Weyl points in WTe2 means that many of its physical properties are 
very different to those of standard Weyl semimetals with point-like 
Fermi surfaces. 

The band structure of some metals has non-trivial topological features.
Of such metals, the ones with vanishingly small density of states
at the Fermi level—semimetals—stand out. For these materials, a dis- 
tinction between topologically protected surface states and bulk metal- 
lic states can be made and their Fermi surfaces can be topologically 
characterized, unlike the case for metals, which have many states at the 
Fermi level. Two kinds of topological semimetals have attracted spe- 
cial attention: Dirac and Weyl semimetals. In these materials, a linear 
crossing of two (Weyl) or four (Dirac) bands occurs at the Fermi level 
(see Fig. 1a). The effective Hamiltonian for these crossings is given by 
the Weyl or gapless-Dirac equation, respectively. The Weyl crossings 
are protected from gapping, owing to the massless nature of the Weyl 
fermion. In the following, we limit the discussion to Weyl crossings 
only, although our results also hold for Dirac crossings. 

The appearance of Weyl points (WPs) is possible only if the product 
of parity and time reversal is not a symmetry of the structure. When 
present, a WP acts as a topological charge—either a source or a sink of 
Berry curvature. A Fermi surface enclosing a WP has a well-defined 
Chern number, corresponding to the topological charge of this WP. 
Because the net charge must vanish in the entire Brillouin zone, WPs 
always come in pairs; they are stable to weak perturbations and are 
annihilated only in pairs of opposite charge. A large number of unusual 
physical phenomena are associated with Weyl topological semimetals,
including the existence of open Fermi arcs in the surface Fermi
surface¹ and various magnetotransport anomalies.

Weyl semimetals with broken time-reversal symmetry have been
predicted to exist in several materials¹,¹⁶,¹⁷, but these predictions have
yet to be experimentally verified. More recently, the Weyl semimetal
was predicted to exist in inversion-breaking single-crystal non-magnetic
materials of the TaAs class³,⁴; this prediction has since been
verified experimentally⁵,⁶.


Weyl semimetals were previously thought to have a point-like 
Fermi surface at the WP. We refer to these as type-I WPs (WP1s), to 
distinguish them from the new type-II WPs (WP2s) that exist at the 
boundaries between electron and hole pockets, as illustrated in Fig. 1b. 
We discuss general conditions for WP2s to appear, and present evidence
that WTe2 (the material with the largest never-saturating
magnetoresistance reported¹⁸ so far) is an example of the new type
of topological semimetal hosting eight WP2s. These WP2s come in 
two quartets located 0.052 eV and 0.058 eV above the Fermi level. 
We present topological arguments that prove the existence of the 
new topological semimetal phase in WTe2. We provide evidence of 
doping-driven topological Lifshitz transitions, which are characteristic 
of WP2s, as well as emerging Fermi arcs in the surface Fermi surface. 

We start by considering the most general Hamiltonian describing 
a WP 


$$H(\mathbf{k}) = \sum_{\substack{i = x, y, z \\ j = 0, x, y, z}} k_i A_{ij} \sigma_j$$

where k is the wave vector in reciprocal space (crystal momentum vector),
A is a 3 × 4 matrix of coefficients, σ_0 is the 2 × 2 unit matrix and
σ_j, j = x, y, z are the three Pauli matrices. The energy spectrum is


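$$\varepsilon_{\pm}(\mathbf{k}) = \sum_{i = x, y, z} k_i A_{i0} \pm \sqrt{\sum_{j = x, y, z} \Bigl( \sum_{i = x, y, z} k_i A_{ij} \Bigr)^{2}} = T(\mathbf{k}) \pm U(\mathbf{k}) \qquad (1)$$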


where T(k) and U(k) can be considered as the kinetic and potential 
components of the energy spectrum. T(k), which is linear in momentum,
tilts the cone-like spectrum ε±(k). This tilt breaks the Lorentz
invariance of Weyl fermions in quantum field theory, but was previously 
considered unimportant. However, because Lorentz invariance does not 
need to be respected in condensed matter, its inclusion is important and 
leads to a finer classification of distinct Fermi surfaces, in correspond- 
ence with the theory of quadric surfaces, which suggests that there are 
exactly two distinct types of WPs (see Supplementary Information). 

If, for a particular direction in reciprocal space, T is dominant over 
U, the tilt becomes large enough to cause a WP to appear at the point 
where the open electron and hole pockets touch, contrary to the stand- 
ard case of a point-like Fermi surface. Thus, the condition for a WP to 
be of type II is that there exists a direction k for which T(k) > U(k). If 
such a direction does not exist, then the WP is of type I. The clear 
qualitative distinction between the Fermi surfaces of the two types of 
WPs leads to marked differences in the thermodynamics of the hosting 
materials and their response to magnetic fields. In particular, in contrast
to a WP1, which exhibits a chiral anomaly for any direction of
the magnetic field, the chiral anomaly appears in a WP2 only when the 
direction of the magnetic field is within a cone where |T(k)| >|U(k)|. 
If the field direction is outside this cone, then the Landau-level spec- 
trum is gapped and has no chiral zero mode (see Supplementary 
Information). 
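The type-II criterion can be checked directly from the coefficient matrix A. The sketch below is illustrative only and is not taken from the paper or its Supplementary Information. It uses the fact that U(k)² − T(k)² = kᵀ(BBᵀ − ttᵀ)k, where t is the σ_0 column of A and B holds the three Pauli columns, so a direction with |T| > U exists exactly when that symmetric matrix has a negative eigenvalue. The function names and example matrices are hypothetical.

```python
import numpy as np


def tilt_and_cone(A, k):
    """Return T(k) and U(k) of equation (1) for a 3x4 coefficient matrix A."""
    A = np.asarray(A, dtype=float)
    k = np.asarray(k, dtype=float)
    T = k @ A[:, 0]                      # sum_i k_i A_i0
    U = np.linalg.norm(k @ A[:, 1:])     # sqrt(sum_j (sum_i k_i A_ij)^2)
    return T, U


def weyl_point_type(A):
    """Classify a Weyl point as 'I' or 'II'.

    Returns the type and, for type II, one direction along which |T| > U.
    The marginal case (smallest eigenvalue exactly zero) is reported as type I.
    """
    A = np.asarray(A, dtype=float)
    t = A[:, 0]
    B = A[:, 1:]
    M = B @ B.T - np.outer(t, t)         # k^T M k = U(k)^2 - T(k)^2
    evals, evecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    if evals[0] < 0:
        return "II", evecs[:, 0]
    return "I", None
```

For example, an isotropic cone (B equal to the identity) with a tilt of 0.5 along z is classified as type I, and the same cone with a tilt of 1.5 along z is classified as type II, with the returned direction along ±z.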

On the lattice, the ‘no-go’ theorem¹⁹ guarantees that Weyl fermions
appear in pairs with Chern numbers of opposite sign. Because the


¹Theoretische Physik and Station Q Zurich, ETH Zurich, 8093 Zurich, Switzerland. ²Department of Physics, Princeton University, Princeton, New Jersey 08544, USA. ³Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China.


26 NOVEMBER 2015 | VOL 527 | NATURE | 495 




Figure 1 | Possible types of Weyl semimetals. a, Type-I WP with a point- 
like Fermi surface. b, A type-II WP appears as the contact point between 
electron and hole pockets. The grey plane corresponds to the position of 
the Fermi level, and the blue (red) lines mark the boundaries of the hole 
(electron) pockets. 


Chern number of a WP is not changed by T(k), WPs of different type 
can be chiral/anti-chiral partners of each other. The number of WPs of 
a certain type can be odd, but the total number of WPs must be even 
(for example, there can be one WP1 and one WP2). 
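That the tilt T(k) leaves the Chern number unchanged can be verified numerically: T(k) multiplies the unit matrix, so it shifts both bands without changing their eigenvectors. The sketch below is illustrative and is not the Wilson-loop or hybrid-Wannier-centre procedure used by the authors. It assumes a standard plaquette (link-variable) discretization of the Berry flux of the lower band through a unit sphere around the WP. The overall sign depends on orientation conventions, so only relative signs are meaningful; all names are hypothetical.

```python
import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z
SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)


def lower_band_state(k, A):
    """Eigenvector of the lower band of H(k) = sum_ij k_i A_ij sigma_j."""
    t = k @ A[:, 0]                       # coefficient of the unit matrix (tilt)
    d = k @ A[:, 1:]                      # coefficients of the Pauli matrices
    H = t * np.eye(2) + np.einsum("j,jab->ab", d, SIGMA)
    _, vecs = np.linalg.eigh(H)           # eigenvalues sorted ascending
    return vecs[:, 0]


def chern_number(A, n_theta=40, n_phi=40):
    """Berry flux of the lower band through a unit sphere, in units of 2*pi."""
    A = np.asarray(A, dtype=float)
    thetas = np.linspace(0, np.pi, n_theta + 1)
    phis = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    states = np.empty((n_theta + 1, n_phi, 2), dtype=complex)
    for a, th in enumerate(thetas):
        for b, ph in enumerate(phis):
            k = np.array([np.sin(th) * np.cos(ph),
                          np.sin(th) * np.sin(ph),
                          np.cos(th)])
            states[a, b] = lower_band_state(k, A)
    flux = 0.0
    for a in range(n_theta):
        for b in range(n_phi):
            b2 = (b + 1) % n_phi          # phi is periodic
            u1, u2 = states[a, b], states[a + 1, b]
            u3, u4 = states[a + 1, b2], states[a, b2]
            # Gauge-invariant product of overlaps around one plaquette
            loop = (np.vdot(u1, u2) * np.vdot(u2, u3)
                    * np.vdot(u3, u4) * np.vdot(u4, u1))
            flux += np.angle(loop)
    return int(round(flux / (2 * np.pi)))
```

With B equal to the identity, the result has magnitude 1 whether the tilt along z is 0 (type I) or 1.5 (type II), and reversing the sign of one column of B reverses the sign of the result.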

We now describe WTe2, a material we identified to host the new
WPs. The crystal structure of WTe2 is orthorhombic with space group
Pmn2₁ (C₂ᵥ⁷). Its primitive unit cell contains four formula units. The
atomic structure is layered, with single layers of W separated from each 
other by Te bilayers and stacked along the z axis (see Supplementary 
Information). The distance between adjacent W atoms is considerably 
smaller along the x axis than it is along the y or z axes, creating strong 
anisotropy. The unit cell has two reflection symmetries: a mirror in the
y-z plane m_x, and a glide plane g_y, formed by a reflection in the x-z
plane followed by a translation by (0.5, 0, 0.5). Combined, they form a
non-symmorphic twofold rotation C2 (that is, a twofold rotation that
is combined with a translation by a fraction of a lattice constant), which
is important in the following symmetry arguments.

The result of band-structure calculations (see Supplementary
Information) without spin-orbit coupling (SOC) is shown in Fig. 2a
along the Γ-X direction, where an intermediate point Σ = (0.375, 0,
0) is introduced. In addition to electron and hole pockets, 16 WPs per
spin are found in WTe2 in the absence of SOC (not shown in Fig. 2a).
Half of these points occur at points of low symmetry with k_z ≠ 0; the
other half appear in the k_z = 0 plane, where the product of time reversal
and C2 (C2T = C2·T) forms a little group. Generically, degeneracies
on high-symmetry planes are forbidden; however, owing to the C2T
symmetry, twofold degeneracies are locally stable at points in the k_z = 0
plane. On the Γ-X line, the spectrum is generally gapped with a bandgap
of approximately 1 meV, separating valence and conduction bands;
see Fig. 2a.

Accounting for spin, but without SOC, bands become doubly degen- 
erate, owing to opposite spin projections. This degeneracy doubles the 
topological charge of each WP because, by SU(2) symmetry, WPs 
corresponding to opposite spins have identical topological charge. 
Infinitesimal SOC cannot gap these WPs, giving a general criterion 
by which to search for Weyl semimetals: WPs are first found without 
SOC on the high-symmetry planes; the effects of SOC on these WPs 
are studied separately. 

In WTe2 SOC is not small. When turned on, it preserves electron 
and hole pockets, but substantially changes the structure of WPs. At 
intermediate SOC, WPs move, emerging or annihilating in pairs of 
opposite chirality. At full SOC, all WPs with k_z ≠ 0 are annihilated.
In the k_z = 0 plane, double degeneracies at isolated k points are still
allowed by symmetry. Eight such gapless points are found, formed by 
the topmost valence and lowest conduction bands at full SOC. A pair of 
such points is shown in Fig. 2c. The other three pairs are related to this 
one by reflections. Energetically, both points are located only slightly 
(0.052 eV and 0.058 eV) above the Fermi energy E_F; see Supplementary
Information for details. 




[Figure 2, panels a-c: band energies (eV); panel c marks the two WPs at (0.1214, 0.0454, 0) and (0.1218, 0.0382, 0) along the line K-K’.]

Figure 2 | Band structure of WTe2. a, Band structure of WTe2 without
SOC. A fraction of the Γ-X segment is shown: the point Σ has coordinates
(0.375, 0, 0). A bandgap of approximately 1 meV is shown in the inset,
signalling a gapless point nearby. b, Band structure of WTe2 with SOC.
c, One of the four pairs of WPs is shown along the line K-K’, where
K = (0.1208, 0.0562, 0) and K’ = (0.1226, 0.0238, 0). Their locations are
designated in reduced coordinates (in units of reciprocal lattice constants).


Establishing degeneracies of bands (and the existence of WPs) com- 
putationally (or by inspection) is prone to finite-size effects: a point 
thought to be a degeneracy point might turn out to have a minuscule 
gap upon increasing computational precision. To rigorously establish 
the presence of WPs, we performed many tests that involve computing 
topological indices. The topological charge (+1) of each WP was found 
using an extension of the Wilson-loop and hybrid-Wannier-centres 
methods²⁰,²¹ to type-II Weyl semimetals. Z2 topological indices were
also computed on several planes (including those in both standard and 
non-standard geometries) in the Brillouin zone. In total, these tests not 
only proved the existence of WPs, but also elucidated the structure of 
the Berry-flux connection between WPs and of the Fermi arcs on the 
surface of WTe2. The resultant Fermi-arc structure is consistent with
the calculations presented below. A detailed description of topological 
indices and ways to obtain them are found in Supplementary 
Information. 

To check the nature of the WPs, we obtained the energy spec- 
trum around them from first-principles calculations and fitted it to 
the theoretical model derived by symmetry analysis (Supplementary 
Information). Considering only linear terms in k (the momentum relative
to the position of the WP), the spectrum in equation (1) becomes

$$\varepsilon_{\pm}(\mathbf{k}) = A k_x + B k_y \pm \sqrt{e^{2} k_z^{2} + (a k_x + c k_y)^{2} + (b k_x + d k_y)^{2}}$$


The values of the parameters A, B, a, b, c, d and e are given in the 
Supplementary Information. The kinetic component of the energy 
dominates along the line connecting this WP to its nearest neighbour 
(see Fig. 2c and Supplementary Information). We thus conclude that 
WTe2 is a type-II Weyl semimetal.
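The check behind this conclusion can be sketched for the linearized spectrum. The fitted values of A, B, a, b, c, d and e are given only in the Supplementary Information, so the parameter sets used to exercise the code are placeholders, not the WTe2 values, and the function names are hypothetical. A direction in the k_z = 0 plane is tilt-dominated when both branches ε₊ and ε₋ have the same sign, which is equivalent to |T| > U.

```python
import math


def linear_spectrum(kx, ky, kz, p):
    """Return (eps_plus, eps_minus) of the linearized model.

    p is a dict with keys 'A', 'B', 'a', 'b', 'c', 'd', 'e'.
    """
    tilt = p["A"] * kx + p["B"] * ky                      # kinetic part T(k)
    cone = math.sqrt((p["e"] * kz) ** 2
                     + (p["a"] * kx + p["c"] * ky) ** 2
                     + (p["b"] * kx + p["d"] * ky) ** 2)  # potential part U(k)
    return tilt + cone, tilt - cone


def tilt_dominated_directions(p, n=3600):
    """In-plane angles phi in [0, pi) along which |T| > U."""
    angles = []
    for s in range(n):
        phi = math.pi * s / n
        upper, lower = linear_spectrum(math.cos(phi), math.sin(phi), 0.0, p)
        if upper * lower > 0:   # both branches on the same side of the WP energy
            angles.append(phi)
    return angles
```

A non-empty list returned by `tilt_dominated_directions` signals a type-II point for the supplied parameters; an empty list means no tilt-dominated direction was found in the k_z = 0 plane at the chosen angular resolution.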

We now discuss the Fermi surface topology and possible topological
Lifshitz transitions in WTe2. The evolution of the Fermi surface
obtained from first-principles calculations is shown in Fig. 3 for different
values of E_F. Owing to reflection symmetries, only part of the k_z = 0
plane of the Fermi surface is shown. For E_F = 0 eV, the Fermi surface is
formed of two pairs of electron pockets and two pairs of hole pockets
(eight pockets in total), which are separated in momentum space. For
each pair, the larger pocket completely encloses the smaller one, in
agreement with experiments²². This property is illustrated in Fig. 3a,
where four halved pockets (two electron and two hole) are shown. The
other halves are obtained by the glide reflection g_y, and the remaining
four pockets with k_x > 0 are obtained by the mirror reflection m_x. All
Fermi surfaces have zero Chern numbers when E_F = 0.

When E_F is raised, two additional electron pockets appear; the
previously existing electron pockets persist. The hole pockets shrink 
quickly, two disappearing completely. Each of the remaining two split 
into two disconnected pockets. As a result, there are six electron pock- 
ets and four hole pockets in total (see Fig. 3b). When the Fermi level 




[Figure 3, panels a-e: Fermi-surface contours in the k_z = 0 plane, with axes k_x and k_y.]

Figure 3 | Fermi surface at k_z = 0. A part of the Brillouin zone is shown.
a, E_F = 0 eV; electron pockets (blue and green solid lines) and hole pockets
(red and magenta dashed lines) come in pairs. WP with Chern number +1
(−1) is shown in red (blue). b, The representative structure of electron and
hole pockets at higher energies (E_F = 0.055 eV shown). There are four hole
pockets (one shown; dashed magenta line) and six electron pockets (halves
of three of them shown; blue and green solid lines). The boxed region is
the region shown in c-e for different values of E_F. c, E_F = 0.052 eV is set to
the lower-energy WP. Contact between electron and hole pockets occurs
at this WP. d, E_F = 0.055 eV is set to be between the two WPs. The electron
and hole pockets are disconnected. The hole pocket encloses a WP with a
Chern number C = +1. The electron pocket encloses the WP with C = −1
and its mirror image (not shown); the net Chern number of this pocket is
zero. e, When E_F = 0.058 eV is set to the higher-energy WP, electron and
hole pockets touch again (shown). They reopen at larger E_F with zero
Chern numbers.


is tuned to the first WP, E_F = 0.052 eV (corresponding to the addition
of approximately 0.064 electrons per unit cell), each of the two newly
appeared electron pockets touches two hole pockets at the positions of
the WPs, as illustrated in Fig. 3c for part of the k_z = 0 plane. Further increase
of E_F disconnects the electron and hole pockets again (see Fig. 3d
for E_F = 0.055 eV), but with changed topology: electron pockets still
have zero Chern numbers because they enclose two WPs of opposite
charge, related by g_y. The hole pockets have Chern numbers of +1.
Topologies of the other hole pockets are obtained by changing the sign
of the Chern number according to the appropriate mirror and glide
symmetries. The pockets touch again (see Fig. 3e) when the Fermi
level is tuned to the higher-energy WP, E_F = 0.058 eV (corresponding
to approximately 0.079 additional electrons per unit cell). Upon raising
E_F further, the pockets disconnect again, and all Fermi-surface Chern
numbers become zero.

To facilitate the observation of topological Lifshitz transitions, hydro- 
static pressure is applied. Neighbouring WPs are pushed away from 
each other in k space under compression. In particular, a 0.5% (2%) 
compression increases the distance between the WPs from 0.7% to 2% 
(4%) of the reciprocal vector |G2| (see Supplementary Information for 
a discussion of strain effects, including how to obtain only four WPs). 

Finally, we discuss the topological surface states of WTe2. Owing to 
reflection symmetries, WPs of opposite chirality are projected on top of 
each other on the (100) and (010) surfaces, which hence do not exhibit 
topologically protected surface states. 

For the (001) surface, all the WPs project onto distinct points; hence, 
topological surface states appear. When E_F is tuned to be between the



Figure 4 | Topological surface states. a, Spectral function of the (001) 
surface. The Fermi level (green line) is set to be between the WPs. b, Fermi 
surface of the (001) surface and a Fermi arc connecting hole and electron 
pockets. Green crosses mark the positions of WPs. 


WPs, the hole pocket has non-zero Chern number and a Fermi arc 
emerges from it, connecting it to the WP of opposite Chern number 
inside the electron pocket. Figure 4a illustrates the spectral function 
of the (001) surface, where surface states connecting electron and hole 
bands are clearly visible. The Fermi surface of this surface has a top- 
ological Fermi arc (Fig. 4b) connecting projections of the topological 
hole (Fig. 3c—e) and electron pockets. The other surface state crossing 
the hole pocket emerges from the electron pocket (not seen in Fig. 4) 
and goes back into it, and thus can be pushed into the continuum of 
bulk states (see Supplementary Information). 

Of other transition metal dichalcogenides, another strong candidate 
material is MoTe2 (ref. 23), which is reported to be a semimetal resembling
pressurized WTe2. This material can also be used to explore new
physical phenomena arising in the new topological semimetal phase 
presented here. 


Received 1 June; accepted 22 September 2015. 


1. Wan, X., Turner, A. M., Vishwanath, A. & Savrasov, S. Y. Topological semimetal 
and Fermi-arc surface states in the electronic structure of pyrochlore iridates. 
Phys. Rev. B 83, 205101 (2011). 

2. Volovik, G. E. The Universe in a Helium Droplet (Oxford Univ. Press, 2009). 

3. Weng, H., Fang, C., Fang, Z., Bernevig, B. A. & Dai, X. Weyl semimetal phase in 
noncentrosymmetric transition-metal monophosphides. Phys. Rev. X 5, 
011029 (2015). 

4. Huang, S.-M. et al. An inversion breaking Weyl semimetal state in the TaAs 
material class. Nature Commun. 6, 7373 (2015). 

5. Xu, S.-Y. et al. Discovery of a Weyl fermion semimetal and topological Fermi 
arcs. Science 349, 613-617 (2015). 




6. Lv, B. Q. et al. Experimental discovery of Weyl semimetal TaAs. Phys. Rev. X 5,
031013 (2015).

7. Weyl, H. Elektron und Gravitation. I. Z. Phys. 56, 330-352 (1929).

8. Silaev, M. A. & Volovik, G. E. Topological Fermi arcs in superfluid 3He. Phys. Rev.
B 86, 214511 (2012).

9. Nielsen, H. B. & Ninomiya, M. The Adler-Bell-Jackiw anomaly and Weyl
fermions in a crystal. Phys. Lett. B 130, 389-396 (1983).

10. Zyuzin, A. A. & Burkov, A. A. Topological response in Weyl semimetals and the
chiral anomaly. Phys. Rev. B 86, 115133 (2012).

11. Hosur, P. & Qi, X. Recent developments in transport phenomena in Weyl
semimetals. C. R. Phys. 14, 857-870 (2013).

12. Volovik, G. E. Kopnin force and chiral anomaly. JETP Lett. 98, 753-757 (2014).

13. Zhang, C. et al. Observation of the Adler-Bell-Jackiw chiral anomaly in a Weyl
semimetal. Preprint at http://arXiv.org/abs/1503.02630 (2015).

14. Xiong, J. et al. Signature of the chiral anomaly in a Dirac semimetal: a current
plume steered by a magnetic field. Preprint at http://arXiv.org/abs/1503.08179
(2015).

15. Huang, X. et al. Observation of the chiral-anomaly-induced negative
magnetoresistance in 3D Weyl semimetal TaAs. Phys. Rev. X 5, 031023 (2015).

16. Xu, G., Weng, H., Wang, Z., Dai, X. & Fang, Z. Chern semimetal and the
quantized anomalous Hall effect in HgCr2Se4. Phys. Rev. Lett. 107, 186806
(2011).

17. Burkov, A. A. & Balents, L. Weyl semimetal in a topological insulator multilayer.
Phys. Rev. Lett. 107, 127205 (2011).

18. Ali, M. N. et al. Large, non-saturating magnetoresistance in WTe2. Nature 514,
205-208 (2014).

19. Nielsen, H. B. & Ninomiya, M. Absence of neutrinos on a lattice: (I). Proof by
homotopy theory. Nucl. Phys. B 185, 20-40 (1981).

20. Soluyanov, A. A. & Vanderbilt, D. Computing topological invariants without
inversion symmetry. Phys. Rev. B 83, 235401 (2011).

21. Yu, R., Qi, X. L., Bernevig, A., Fang, Z. & Dai, X. Equivalent expression of Z2
topological invariant for band insulators using the non-Abelian Berry
connection. Phys. Rev. B 84, 075119 (2011).

22. Pletikosić, I., Ali, M. N., Fedorov, A. V., Cava, R. J. & Valla, T. Electronic structure
basis for the extraordinary magnetoresistance in WTe2. Phys. Rev. Lett. 113,
216601 (2014).

23. Brown, B. E. The crystal structures of WTe2 and high-temperature MoTe2. Acta
Crystallogr. 20, 268-274 (1966).


Supplementary Information is available in the online version of the paper. 


Acknowledgements A.A.S., D.G., Q.S.W. and M.T. acknowledge the support
of Microsoft Research, the Swiss National Science Foundation through the
National Competence Center in Research MARVEL and the European Research
Council through ERC Advanced Grant SIMCOFE. Z.W. and B.A.B. acknowledge
the support of ARO MURI W911NF-12-1-0461, ONR-N00014-11-1-0635, NSF
CAREER DMR-0952428, NSF MRSEC DMR-0819860, the Packard Foundation
and a Keck grant. X.D. is supported by the National Natural Science
Foundation of China, the 973 program of China (no. 2011CBA00108 and
no. 2013CB921700) and the “Strategic Priority Research Program (B)” of
the Chinese Academy of Sciences (no. XDB07020100).


Author Contributions All authors contributed to performing the calculations 
and the analysis of the results. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
A.A.S. (soluyanov@itp.phys.ethz.ch).


LETTER 


doi:10.1038/nature16066 


Ultrafast ultrasound localization microscopy for 
deep super-resolution vascular imaging 


Claudia Errico, Juliette Pierre, Sophie Pezet, Yann Desailly, Zsolt Lenkei, Olivier Couture & Mickael Tanter


Non-invasive imaging deep into organs at microscopic scales 
remains an open quest in biomedical imaging. Although optical 
microscopy is still limited to surface imaging owing to optical wave 
diffusion and fast decorrelation in tissue, revolutionary approaches 
such as fluorescence photo-activated localization microscopy 
led to a striking increase in resolution by more than an order of 
magnitude in the last decade¹. In contrast with optics, ultrasonic
waves propagate deep into organs without losing their coherence 
and are much less affected by in vivo decorrelation processes. 
However, their resolution is impeded by the fundamental limits 
of diffraction, which impose a long-standing trade-off between 
resolution and penetration. This limits clinical and preclinical 
ultrasound imaging to a sub-millimetre scale. Here we demonstrate 
in vivo that ultrasound imaging at ultrafast frame rates (more than 
500 frames per second) provides an analogue to optical localization 
microscopy by capturing the transient signal decorrelation of 
contrast agents—inert gas microbubbles. Ultrafast ultrasound 
localization microscopy allowed both non-invasive sub-wavelength 
structural imaging and haemodynamic quantification of rodent 
cerebral microvessels (less than ten micrometres in diameter) 
more than ten millimetres below the tissue surface, leading to 
transcranial whole-brain imaging within short acquisition times 
(tens of seconds). After intravenous injection, single echoes from 
individual microbubbles were detected through ultrafast imaging. 
Their localization, not limited by diffraction, was accumulated over 
75,000 images, yielding 1,000,000 events per coronal plane and 
statistically independent pixels of ten micrometres in size. Precise 
temporal tracking of microbubble positions allowed us to extract 
accurately in-plane velocities of the blood flow with a large dynamic 
range (from one millimetre per second to several centimetres 
per second). These results pave the way for deep non-invasive 
microscopy in animals and humans using ultrasound. We anticipate 
that ultrafast ultrasound localization microscopy may become an 
invaluable tool for the fundamental understanding and diagnostics 
of various disease processes that modify the microvascular blood 
flow, such as cancer, stroke and arteriosclerosis. 

The recent discovery of super-resolution optical microscopy led to 
a revolutionary improvement of resolution through the use of different
technical approaches¹,². One major implementation, fluorescence
photo-activated localization microscopy (FPALM), exploits the 
stochastic blinking of specific fluorescent sources to separate them 
into individual events in independent frames. A super-resolved image 
is obtained by localizing the centre of each separable source and accu- 
mulating these positions over thousands of acquisitions. The resulting 
image highlights structures that are hundreds of times smaller than the 
wavelength, such as the cell membrane and small organelles’. 

In clinical ultrasound imaging, intravenously injected contrast 
agents (1–3-µm-diameter microbubbles) act as intravascular acoustic 
sources to reveal the vascular bed. At typical concentrations, a cloud 
of microbubbles can be considered as a sub-wavelength random 
distribution of Rayleigh scatterers. The resolution of ultrasound contrast 
imaging is limited by the classical wave diffraction theory and 
corresponds roughly to the ultrasonic wavelength (typically between 
200 µm and 1 mm in clinical applications). Nevertheless, thanks to 
the advent of ultrafast ultrasound imaging (ref. 4), we recently proposed an 
ultrasound equivalent of FPALM (refs 5, 6) that surpassed the conventional 
diffraction limit of echography by more than tenfold. The use of ultra- 
fast acquisitions based on plane wave transmissions at the rate of a 
thousand frames per second may lead to several key advantages when 
imaging contrast agents. First, the decorrelation of the microbubble 
signal from frame to frame is typically in the millisecond range (ref. 7). 
As the tissue signature decorrelates more slowly than the microbubble 
signal, it is thus removed by simply applying a differential subtraction 
filter of consecutive frames. Second, since they respond to ultrasound 
differently over several frames, microbubbles blink separately through 
the spatiotemporal differentiation process and become temporally sep- 
arable sources. Last, since the ultrasonic sequence provides simulta- 
neously very high temporal resolution in all pixels of the image, it 
becomes possible to track the signature of many individual micro- 
bubbles both in space and time and thus to quantify the local blood 
flow speed over a very large dynamical range. As ultrasonic waves 
can penetrate several centimetres of tissue, extracting the positions of 
each of these bubbles could lead to the full reconstruction of the deep 
vascular system down to the level of capillaries. However, the useful- 
ness of these theoretical benefits remains to be demonstrated in vivo. 
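
The frame-to-frame differential filter described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `difference_frames` and the toy data are illustrative assumptions. Static "tissue" cancels in the subtraction, whereas a bright "bubble" that moves between frames survives.

```python
def difference_frames(stack):
    """Return frame-to-frame differences for a list of 2-D frames (lists of rows)."""
    diffs = []
    for prev, curr in zip(stack, stack[1:]):
        diffs.append([[c - p for p, c in zip(prow, crow)]
                      for prow, crow in zip(prev, curr)])
    return diffs


# Toy data (hypothetical): a constant tissue background ...
tissue = [[5, 5, 5, 5],
          [5, 9, 9, 5],
          [5, 5, 5, 5]]


def with_bubble(col):
    """Copy the tissue frame and add one bright bubble in the top row."""
    frame = [row[:] for row in tissue]
    frame[0][col] += 10
    return frame


# ... plus a bubble that moves one pixel to the right per frame.
stack = [with_bubble(0), with_bubble(1), with_bubble(2)]
filtered = difference_frames(stack)
```

In `filtered`, every pixel belonging only to the static background is zero, and the bubble appears as a negative/positive pair marking where it left and where it arrived.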

Current methods for in vivo microvascular imaging are limited by 
trade-offs between the depth of penetration, resolution and acquisition 
time. For instance, microcomputed tomography (ref. 8) and magnetic 
resonance imaging (ref. 9) are able to resolve vessels down to a few 
tens of micrometres with deep tissue penetration, but they remain 
limited by long scanning times. Near-infrared II fluorescence 
imaging (ref. 10) has high spatial resolution (~50 µm) and fast acquisition 
times (<200 ms). Nevertheless, it lacks sufficient tissue penetration 
(<1–3 mm) for whole-brain imaging. High-resolution photoacoustic 
imaging (ref. 11) does not require contrast agents and can attain resolutions 
of a few micrometres, but also lacks penetration (0.75 mm). Finally, 
acoustic angiography resolves tumour vessels around 150 µm in 
diameter, but is still hampered by the trade-off between penetration 
and resolution (ref. 12). 

Here, we demonstrate ultrafast ultrasound localization microscopy 
(uULM), which combines deep penetration and super-resolution 
imaging at unprecedented spatiotemporal resolution, by using clin- 
ically approved contrast agents: inert gas microbubbles. uULM is 
implemented in vivo on anaesthetized male Sprague-Dawley rats fixed 
within a stereotactic frame. Their skull was either left intact or thinned 
to reduce the acoustic attenuation caused by the bone. We used a small 
ultrasonic probe, connected to a fully programmable ultrafast ultrasound 
scanner, to image a coronal slice of the brain. 

¹INSERM, Institut Langevin, 1 rue Jussieu, 75005 Paris, France. ²Institut Langevin, ESPCI-ParisTech, PSL Research University, 1 rue Jussieu, 75005 Paris, France. ³CNRS UMR 7587, 1 rue Jussieu, 
75005 Paris, France. ⁴CNRS, UMR 8249, 10 rue Vauquelin, 75005 Paris, France. ⁵Brain Plasticity Unit, ESPCI-ParisTech, PSL Research University, 10 rue Vauquelin, 75005 Paris, France. 

*These authors contributed equally to this work. 

26 NOVEMBER 2015 | VOL 527 | NATURE | 499 

© 2015 Macmillan Publishers Limited. All rights reserved 

The major challenge of ULM is to intercept a sufficient number 
of separable sources (microbubbles) in the blood stream to obtain 
super-resolved vasculature maps over a large region within a reason- 
able acquisition time. Therefore, we detected microbubbles in the 
rat brain cortex by looking at their fast decorrelation within a stack 
of 75,000 images acquired continuously for 150 s. The millisecond-timescale 
decorrelation of the microbubble signal can be generated 
by several processes, including disruption, dissolution and motion. In 
the current implementation, pulse sequences were chosen to reduce 
ultrasound-induced disruption or dissolution of microbubbles. As 
microbubbles are point scatterers and since small variations of phase 
can be detected in the radio-frequency data, microbubble displacement 
much smaller than the wavelength appears as a strong decorrelation 
signal on differentially filtered images. Moreover, by exploiting 
the coherence of backscattered signals, the spatiotemporal filtering 
approach discriminates slowly moving objects of sub-wavelength 
size (low spatial coherence), that is, bubbles, from slow motion tissue 
signals whose temporal variations affect many neighbouring pixels 
the same way (high spatial coherence). The ultrafast frame rate was 
achieved by emitting plane waves and collecting the backscattered 
echoes with all the array elements. For each transmission, the resulting 
echoes were exploited to reconstruct in silico an entire ultrasonic frame 
by using parallel beamforming. In the averaged stack of ultrasound 
images only the thinned skull was observable (Fig. 1a). The decorre- 
lation of bubbles was detected using frame-to-frame differential pro- 
cessing, which yields individual and fast-changing sources within the 
ultrafast ultrasound images (Fig. 1b). This high-pass filter uses the very 
high spatiotemporal sampling to eliminate tissue and skull signals. 
Since microbubbles are much smaller than the wavelength (1–3 µm 
versus 100 µm) and can be individually separated in space and time, 
they appeared as the point-spread function (PSF) of the ultrasound 
system. The spatial coordinates of the bubble centroids were extracted 
one by one by deconvolving the individual sources from the predicted 
Gaussian PSF. As these sources are locally unique, each of these positions 
can be estimated with a 2.5 µm maximum theoretical resolution 
in the axial direction. For example, a blinking microbubble flowing in 
vessels at the level of the primary somatosensory forelimb or hindlimb 
cortex (S1HL/FL) appeared as a spot representing the centre of the 
interpolated PSF (Fig. 1c). 
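
To make the idea of sub-pixel localization concrete, the sketch below estimates the centre of an isolated echo with an intensity-weighted centroid. This is a simpler stand-in for the Gaussian PSF deconvolution used in the text, not the authors' method. The function name, the toy patch and the 100 µm pixel size (roughly one wavelength) are assumptions made for illustration.

```python
def weighted_centroid(patch, pixel_size_um=(100.0, 100.0)):
    """Intensity-weighted centroid of a 2-D patch, returned as (z, x) in micrometres.

    patch: list of rows (depth index first, lateral index second).
    pixel_size_um: assumed (depth, lateral) pixel size of the patch.
    """
    total = sum(sum(row) for row in patch)
    if total <= 0:
        raise ValueError("patch has no positive signal")
    z = sum(i * v for i, row in enumerate(patch) for v in row) / total
    x = sum(j * v for row in patch for j, v in enumerate(row)) / total
    return z * pixel_size_um[0], x * pixel_size_um[1]


# Hypothetical echo of a single bubble, slightly brighter on its right side.
patch = [[0, 1, 0],
         [1, 4, 2],
         [0, 1, 0]]
z_um, x_um = weighted_centroid(patch)
```

The estimated lateral position falls between pixel centres, which is the sense in which the position of an isolated source can be known more finely than the pixel, or the wavelength, would suggest.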

Typically, we localized about 1,000,000 events in 150 s within one 
hemisphere of the brain cortex. Furthermore, we were able to track 
each moving bubble according to its instantaneous position and 
in-plane velocity vector, leading to quantitative and localized maps 
of cerebral blood flow velocity. Hence, ultrafast imaging allows the 
reconstruction of entire organs within tens of seconds, a prerequi- 
site for a preclinical and clinical modality. Far beyond a technological 
leap, ultrafast imaging ensures the necessary discrimination between 
single bubble signatures and tissue at high bubble concentrations 
using optimal spatiotemporal clutter filters (ref. 13). By tracking the local 
motion of bubbles at a kHz rate, it estimates their motion over a very 
large dynamic range of velocities and consequently vessel diameters 
(1 mm s⁻¹ to several cm s⁻¹ and 15 µm to 1-5 mm, respectively) 
during a sufficiently long acquisition time simultaneously in all voxels 
of the image. Finally, in fast-moving or pulsatile organs, tissue 
motion correction could be assessed through speckle tracking with 
micrometric sensitivity to co-register bubble positions in real time or 
post-processing (refs 4, 14, 15). This remains a fundamental asset with respect 
to individual bubble localization techniques based on conventional 
ultrasound sequences, recently discussed (ref. 16), which need to separate 
echoes through high dilution of contrast agents and image clamped 
tissue for extended durations (1 h) because of limited frame rates (ref. 17). 
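
As a rough illustration of how tracking yields velocities, the sketch below converts successive positions of one bubble into in-plane speeds. It assumes positions in micrometres and a 500 Hz compounded frame rate (the value reported in the Methods). The function name and the example track are hypothetical.

```python
import math


def in_plane_speeds(track_um, frame_rate_hz=500.0):
    """Speed in mm/s between consecutive (z, x) positions given in micrometres."""
    dt_s = 1.0 / frame_rate_hz
    speeds = []
    for (z0, x0), (z1, x1) in zip(track_um, track_um[1:]):
        step_um = math.hypot(z1 - z0, x1 - x0)
        speeds.append(step_um * 1e-3 / dt_s)  # micrometres to millimetres, per second
    return speeds


# Hypothetical track: the bubble moves 5 micrometres between frames.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
speeds = in_plane_speeds(track)
```

A 5 µm step per 2 ms frame interval corresponds to 2.5 mm s⁻¹, which lies within the velocity range quoted in the text.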

Figure 1 | Principle of uULM. a, Ultrafast detection of individual sources 
from a low-quality B-mode image (averaged stack of 250 beamformed 
images), through a thinned skull. b, Four representative frames were 
separated by 44 ms (t1–t4) and filtered to remove the slow-moving 
tissue signal. c, Three independent microbubbles blinking over several 
milliseconds from b were followed in the region of interest within 
the cortex. The echo of each bubble event (high-contrast pixels) was 
deconvolved with the PSF to obtain the exact position of the centroid 
(red crosses). Superposition of thousands of occurrences yields a highly 
resolved localization map for this region. 

We obtained extremely detailed structural reconstructions of the 
microvasculature in the rat brain cortex (5 mm width and 3 mm 
depth) under the thinned skull window (Fig. 2a), displaying vessels 
with diameters between 15 µm and 65 µm. The images were reconstructed 
with a pixel size of 10 µm × 8 µm, corresponding to a tenfold 
increase in resolution as compared to conventional ultrasound imaging. 
Furthermore, bifurcations of the penetrating arterioles within 
the S1HL/FL were easily observable down to the terminal branching 
points (Fig. 2a), where vessels attain the hypovascular white matter (ref. 18). 
In comparison, the contrast-enhanced image created using conventional 
power Doppler is limited by diffraction (Fig. 2b; ref. 19), highlighting 
only the large vessels of the rat brain cortex without distinguishing 
details below the wavelength scale. Moreover, Doppler detection is 
strongly biased towards flows that are perpendicular to the array. 

More detailed analysis of the cross-section of individual vessel 
profiles, indicated by lines 1 and 2 in Fig. 2a, yielded diameter sizes 
of 17 µm and 9 µm full-width at half-maximum, respectively, 
corresponding to capillaries (ref. 20) (Fig. 2c). These values represent a 
convolution between the actual size of the vessel and the response of the 
localization microscopy method, giving an upper limit to its resolution 
(wavelength λ/10). Investigation of a branching vessel profile (profile 3 
in Fig. 2a) showed that at a distance of 16 µm (λ/6), the two vessels are 
still clearly separated. Such high resolution depends on the number of 
bubbles present in the reconstructed pixel (10 µm × 8 µm) and could 
thus be further improved with longer integration times. 
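
The λ/10 and λ/6 figures can be checked with a short calculation. The sketch assumes a sound speed of 1,540 m s⁻¹, a conventional soft-tissue value that is not stated in the text, together with the 15 MHz imaging frequency given in the Methods.

```python
def wavelength_um(frequency_hz, sound_speed_m_s=1540.0):
    """Acoustic wavelength in micrometres for an assumed sound speed."""
    return sound_speed_m_s / frequency_hz * 1e6


lam = wavelength_um(15e6)   # roughly 100 micrometres
tenth = lam / 10.0          # compare with the 9-10 micrometre vessel profiles
sixth = lam / 6.0           # compare with the 16 micrometre vessel separation
```

Under this assumption the wavelength is close to the 100 µm quoted in the text, so a 10 µm pixel is about λ/10 and a 16 µm separation is about λ/6.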

Next, we evaluated the ability of our method to measure blood flow 
dynamics in cortical microvessels. Measured blood flow in-plane 
velocities in the rat brain showed a large dynamic range up to several 
cm s⁻¹ for large vessels and down to 2 mm s⁻¹ in small vessels. Blood 
flow velocity inside of the relatively large penetrating artery was well 
resolved (profile 4 in Fig. 2d, e, profile 5 in Fig. 2d, f) and was positively 
correlated with vessel diameter, showing 15 mm s⁻¹ maximum velocity 
at 80 µm diameter and 2 mm s⁻¹ maximum velocity at 15 µm diameter, 
consistent with the literature values (refs 21, 22). Interestingly, it was clear that 
larger vessels support higher flow within their centre with respect 
to their periphery. Although the images are integrated over a slab 
about 100 µm thick, we could separate two sets of vessels simply on 
the basis of their flow velocities. Some bubbles were travelling at a 
much slower speed in the opposite direction to the background 
venules. Moreover, in contrast with conventional ultrasound Doppler 
imaging, which is sensitive mostly to flow towards or away from the 
ultrasound probe, here we also observed and measured microbubbles 
that were moving sideways. This is particularly useful to observe the 
tortuosity of the small vessels and detect abrupt branching in vessels 
within the cortex. 

Figure 2 | Spatial resolution and quantification of uULM in the 
rat brain cortex through a thinned skull window. a, Microbubble 
density maps were reconstructed with a spatial resolution of λ/10 (pixel 
size = 8 µm × 10 µm). b, Same area in a conventional power Doppler 
image. c, Interpolated profiles along the lines marked in a display 9 µm 
vessels (2) and resolve two vessels closer than 16 µm (3). a.u., arbitrary 
units. d, Dynamic tracking of bubbles separates vessels in two populations 
with opposite blood flow direction. Positive values indicate blood flow 
distancing from the probe. Bubble velocities between 1 mm s⁻¹ and 
14 mm s⁻¹ are detectable. e, f, Velocity profiles associated with lines 4 (e) and 
5 (f) in d. Red line, median; blue box, 25th to 75th percentile; whiskers 
extend to the most extreme data points that are not considered outliers; 
other points, outliers. Unpaired Student's t-test. *P < 0.05, **P < 0.01, 
***P < 0.001, ****P < 0.0001. 
Panel axes: amplitude (a.u.) and velocity (mm s⁻¹) against distance (µm); colour scale, velocity from −14 to 14 mm s⁻¹; scale bar, 500 µm. 

Figure 3 | ULM of the rat brain through a thinned skull window or 
through the intact skull. a, uULM performed through a thinned skull at a 
coronal section, Bregma −1.5 mm, providing a resolution of 10 µm × 8 µm 
in depth and lateral direction, respectively. c, uULM performed through 
the intact skull at Bregma −1 mm. Owing to the attenuation of the 
ultrasound waves in the presence of the bone, the achieved resolution 
was 12.5 µm × 11m in depth and lateral direction, respectively. Thus, the 
smallest vessel detectable was 20 µm wide. b, d, In-plane velocity maps 
from parts of the vessels in a and c, respectively. 
Colour scale: velocity from −14 to 14 mm s⁻¹. 

In-plane velocity measurements can define the resolution of ULM. 
We consider that two resolution cells are distinguishable if their veloc- 
ity distributions are statistically different (P < 0.05). The median of 
the upper half of the velocity distribution for each resolution cell is 
displayed in Fig. 2e, f. When the resolution cells are 8–12 µm in size, 
adjacent pixels can be considered distinct. Interestingly, the maximum 
velocities follow a parabolic profile, as expected for vessels of this size. 
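
For readers unfamiliar with the parabolic profile mentioned above, the sketch below evaluates the textbook laminar-flow (Poiseuille) form v(r) = v_max (1 − (r/R)²). Treating the measured vessel as an ideal laminar tube is an assumption made here for illustration. The 80 µm diameter and 15 mm s⁻¹ peak velocity are the values quoted earlier in the text, and the function name is hypothetical.

```python
def parabolic_velocity(r_um, radius_um, v_max_mm_s):
    """Velocity at radial offset r_um for an idealized parabolic (laminar) profile."""
    if abs(r_um) > radius_um:
        return 0.0  # outside the vessel lumen
    return v_max_mm_s * (1.0 - (r_um / radius_um) ** 2)


# 80 micrometre vessel (40 micrometre radius) with a 15 mm/s peak at the centre.
offsets_um = (-40, -20, 0, 20, 40)
profile = [parabolic_velocity(r, 40.0, 15.0) for r in offsets_um]
```

The profile peaks on the vessel axis and falls to zero at the wall, which matches the qualitative observation that flow is faster at the centre of larger vessels than at their periphery.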

Finally, we investigated the spatial coverage of our imaging method. 
At 15 MHz, the attenuation of an ultrasound wave within brain tissue 
is approximately 5 dB cm⁻¹ (ref. 23), which allows imaging at 
several centimetres depth. Super-resolved images could be obtained 
in vivo over the entire depth of the brain (12.5 mm at Bregma −1.5 mm; 
Fig. 3a), demonstrating that ULM can map vessels below the rat 
brain cortex over several coronal planes (Extended Data Fig. 2). 
Super-resolved imaging is also possible through the intact skull 
(Fig. 3c; Bregma −1.0 mm) but the lower signal-to-noise ratio, resulting 
from skull-induced signal attenuation, globally reduces the number 
of localized microbubbles, increasing the limit of the smallest 
detectable vessel. However, this non-invasive version of our imaging 
method can still detect vessels that are 20 µm wide and distinguish 
vessels that are 20 µm apart deep into the brain (>8 mm). In 
the future, the resolution could be further improved by localizing the 
microbubbles directly from radio-frequency data, which could also 
allow the correction of aberrations from the skull (refs 24, 25). 
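
A back-of-the-envelope estimate shows what the quoted attenuation implies at the depths imaged here. The sketch assumes uniform attenuation of 5 dB cm⁻¹ in brain tissue, a round trip of twice the imaging depth, and no skull losses. These simplifications and the function name are assumptions, not part of the original analysis.

```python
def round_trip_attenuation_db(depth_cm, alpha_db_per_cm=5.0):
    """Two-way attenuation in dB for an echo returning from depth_cm."""
    return 2.0 * depth_cm * alpha_db_per_cm


loss_db = round_trip_attenuation_db(1.25)      # 12.5 mm imaging depth
amplitude_ratio = 10.0 ** (-loss_db / 20.0)    # fraction of echo amplitude remaining
```

Under these assumptions an echo from 12.5 mm depth loses 12.5 dB, leaving roughly a quarter of its amplitude before any skull-induced attenuation is counted.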

In conventional clinical ultrasound imaging applications, resolu- 
tion is inherently correlated to the ultrasonic frequency and, con- 
sequently, is inversely correlated to penetration depth. However, in 
uULM, resolution is related to the signal-to-noise ratio, the bandwidth 
of backscattered echoes and the number of array elements used in 
the beamforming process. This indicates that very high resolution 
could be reached, even deep into organs, in clinical applications. As 
microbubbles are clinically approved contrast agents and our acoustic 
parameters are well within the US Food and Drug Administration 
guidelines, such clinical applications could be rapidly implemented 
with conventional transducers. For these reasons, it is conceivable that 
dynamic images of the human brain vasculature could be achieved 
with lower frequency ultrasound (around 1 MHz) that can penetrate 
the skull. Ultrafast ultrasound localization could also be applied to 
other deep-seated organs such as liver, kidney or breast, currently 
imaged with ultrasound by implementing appropriate motion- 
correction algorithms. Such algorithms can be performed through 
image registration based on the cross-correlation of the radio- 
frequency signal acquired at high frame rates, which can detect motion 
at the micrometric scale (refs 4, 14, 15). The microbubble events necessary for 
uULM can then be motion compensated thanks to this co-registered 
image. Consequently, this technique will probably have an important 
impact on the study and diagnostics of normal biological processes or 
diseases such as tumour-related angiogenesis. 

We demonstrate super-resolution images of rat brain microvessels 
with pixel sizes comparable to the size of red blood cells, indicating 
that vessels ten times smaller than the ultrasonic wavelength can 
be mapped. Since ultrafast localization imaging can be performed 
through the skull, non-invasive longitudinal studies may be envisioned 
in the future over single or multiple planes within very reasonable 
acquisition times in preclinical or clinical studies. ULM, by removing 
the diffraction-induced trade-off between resolution and penetration 
of ultrasound waves, emerges as the first in vivo technique for imaging 
and quantifying blood flow at microscopic resolution deep into living 
organs. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 12 June; accepted 30 September 2015. 


1. Hell, S. W. & Wichmann, J. Breaking the diffraction resolution limit by 
stimulated emission: stimulated-emission-depletion fluorescence microscopy. 
Opt. Lett. 19, 780-782 (1994). 

2. Betzig, E. et al. Imaging intracellular fluorescent proteins at nanometer 
resolution. Science 313, 1642-1645 (2006). 




3. Huang, B., Babcock, H. & Zhuang, X. Breaking the diffraction barrier: 
super-resolution imaging of cells. Cell 143, 1047-1058 (2010). 

4. Tanter, M. & Fink, M. Ultrafast imaging in biomedical ultrasound. IEEE Trans. 
Ultrason. Ferroelectr. Freq. Control 61, 102-119 (2014). 

5. Couture, O., Tanter, M. & Fink, M. Method and device for ultrasound imaging. 
Patent Cooperation Treaty (PCT)/FR2011/052810 (2010). 

6. Desailly, Y., Couture, O., Fink, M. & Tanter, M. Sono-activated ultrasound 
localization microscopy. Appl. Phys. Lett. 103, 174107 (2013). 

7. Couture, O. et al. Ultrafast imaging of ultrasound contrast agents. Ultrasound 
Med. Biol. 35, 1908-1916 (2009). 

8. Chugh, B. P. et al. Measurement of cerebral blood volume in mouse brain 
regions using micro-computed tomography. Neuroimage 47, 1312-1318 
(2009). 

9. Huang, C.-H. et al. High-resolution structural and functional assessments of 
cerebral microvasculature using 3D Gas ΔR2*-mMRA. PLoS One 8, e78186 
(2013). 

10. Hong, G. et al. Multifunctional in vivo vascular imaging using near-infrared II 
fluorescence. Nature Med. 18, 1841-1846 (2012). 

11. Yao, J. et al. High-speed label-free functional photoacoustic microscopy of 
mouse brain in action. Nature Methods 12, 407-410 (2015). 

12. Gessner, R. C., Frederick, C. B., Foster, F. S. & Dayton, P. A. Acoustic 
angiography: a new imaging modality for assessing microvasculature 
architecture. Int. J. Biomed. Imaging 2013, 936593 (2013). 

13. Demene, C. et al. Spatiotemporal clutter filtering of ultrafast ultrasound data 
highly increases Doppler and fUltrasound sensitivity. IEEE Trans. Med. Imaging 
PP, 2271-2285 (2015). 

14. Tanter, M., Bercoff, J., Sandrin, L. & Fink, M. Ultrafast compound imaging for 
2-D motion vector estimation: application to transient elastography. IEEE Trans. 
Ultrason. Ferroelectr. Freq. Control 49, 1363-1374 (2002). 

15. Denarie, B. et al. Coherent plane wave compounding for very high frame rate 
ultrasonography of rapidly moving targets. IEEE Trans. Med. Imaging 32, 
1265-1276 (2013). 

16. Viessmann, O. M., Eckersley, R. J., Christensen-Jeffries, K., Tang, M. X. & 
Dunsby, C. Acoustic super-resolution with ultrasound and microbubbles. 
Phys. Med. Biol. 58, 6447-6458 (2013). 

17. Christensen-Jeffries, K., Browning, R. J., Tang, M.-X., Dunsby, C. & Eckersley, R. J. 
In vivo acoustic super-resolution and super-resolved velocity mapping using 
microbubbles. IEEE Trans. Med. Imaging 34, 433-440 (2015). 

18. Paxinos, G. & Watson, C. The Rat Brain in Stereotaxic Coordinates 6th edn 
(Academic, 2006). 

19. Szabo, T. L. in Diagnostic Ultrasound Imaging (ed. Szabo, T. L.) 337-380 
(Academic, 2004). 

20. Mishra, A. et al. Imaging pericytes and capillary diameter in brain slices and 
isolated retinae. Nature Protocols 9, 323-336 (2014). 

21. Itoh, Y. & Suzuki, N. Control of brain capillary blood flow. J. Cereb. Blood 
Flow Metab. 32, 1167-1176 (2012). 

22. Kamoun, W. S. et al. Simultaneous measurement of RBC velocity, flux, 
hematocrit and shear rate in vascular networks. Nature Methods 7, 655-660 
(2010). 

23. Goss, S. A., Frizzell, L. A. & Dunn, F. Ultrasonic absorption and attenuation in 
mammalian tissues. Ultrasound Med. Biol. 5, 181-186 (1979). 

24. Pernot, M., Montaldo, G., Tanter, M. & Fink, M. ‘Ultrasonic stars’ for 
time-reversal focusing using induced cavitation bubbles. Appl. Phys. Lett. 88, 
034102 (2006). 

25. O'Reilly, M. A. & Hynynen, K. A super-resolution ultrasound method for brain 
vascular mapping. Med. Phys. 40, 110701 (2013). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported principally by the Agence 
Nationale de la Recherche (ANR), within the project ANR MUSLI. We thank the 
Fondation Pierre-Gilles de Gennes for funding C.E. The laboratory was also 
supported by LABEX WIFI (Laboratory of Excellence ANR-10-LABX-24) within 
the French Program “Investments for the Future” under reference 
ANR-10-IDEX-0001-02 PSL*. 


Author Contributions C.E., J.P., S.P., Z.L., O.C. and M.T. designed the experiments; 
C.E. and S.P. performed the experiments; C.E., J.P. and O.C. analysed the data; 
Y.D., O.C. and M.T. performed the theoretical analysis. All the authors discussed 
the results and wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare competing financial interests: 
details are available in the online version of the paper. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to M.T. (mickael.tanter@espci.fr). 




METHODS 


Theoretical resolution limit. The given theoretical resolution limit corresponds 
to the position error of the localization process (ref. 6). This PSF deconvolution for 
single isolated spots is inherently limited by the number of channels used in 
receive processing and the timing resolution of the acquisition system. The latter 
is limited mainly by the sampling frequency of the echoes before beamforming. 
An approximate value for the theoretical resolution limit in the axial dimension 
can be obtained by propagating the sampling error in a time-of-flight model, 
which yields: 

$$\sigma_{z_0} \approx \frac{c\,\sigma_\tau}{2\,n^{1/2}}$$

where $\sigma_{z_0}$ is the localization error in the axial dimension, $c$ is the sound speed, 
$\sigma_\tau$ is the timing resolution of the system and $n$ is the number of channels used in 
receive processing. Note that the lower limit of the timing resolution is linked to 
the Cramér–Rao lower bound (CRLB), which describes the minimum obtainable 
estimation error variance when using an unbiased estimator. The derivation of the 
CRLB was given by Walker and Trahey for ultrasound (ref. 26), for tasks related to 
ultrasonic displacement estimation. The standard deviation $\sigma_\tau$ of arrival time estimates 
compared to the theoretical one is described by the relation: 

$$\sigma_\tau \geq \sqrt{\frac{3}{2 f_0^{3}\pi^{2} T\,(B^{3}+12B)}\left[\frac{1}{\rho^{2}}\left(1+\frac{1}{\mathrm{SNR}^{2}}\right)^{2}-1\right]}$$

where $f_0$ is the transmit pulse centre frequency, $B$ is the pulse bandwidth, $T$ is the 
kernel size for the time delay estimation, $\rho$ is the normalized correlation between 
signals (that is, the correlation between the experimental signal and the reference 
signal used for the PSF decorrelation), and SNR is the signal-to-noise ratio of 
receive signals. 

For the lateral resolution, the size of the aperture must also be taken into account 
as in any classical imaging modality: 

$$\sigma_{x_0} \approx \frac{2\sqrt{3}\,c\,\sigma_\tau f}{D\,n^{1/2}}$$

where $\sigma_{x_0}$ is the localization error in the lateral dimension, $f$ is the focal length 
and $D$ is the length of the transducer array (which is the imaging aperture here). 
Following these theoretical models, it is predicted that the 15 MHz array used 
in this study could attain a maximum resolution (full-width at half-maximum) 
of 2.5 µm in the axial direction and 5 µm in the lateral direction at 1 cm depth. 
In humans, lower frequencies are exploited to attain 10 cm penetration. With the 
same theoretical model, we can predict a 6 µm isotropic resolution with a current 
transducer matrix (32 × 32 elements, 300 µm spatial pitch, 2.5 MHz frequency, 
70% frequency bandwidth, ρ > 0.9, 12 dB SNR at 5 cm depth). 
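
The scaling of the two localization errors can be explored numerically. The sketch below assumes the time-of-flight expressions σ_z0 ≈ cσ_τ/(2√n) and σ_x0 ≈ 2√3 cσ_τ f/(D√n). The sound speed (1,540 m s⁻¹) and the 1 ns timing resolution are hypothetical inputs chosen only for illustration. The 128 receive channels, 10 mm depth and 10.24 mm aperture are taken from the text. It is not intended to reproduce the resolution values reported above.

```python
import math


def axial_error_m(c_m_s, sigma_tau_s, n_channels):
    """Axial localization error: c * sigma_tau / (2 * sqrt(n))."""
    return c_m_s * sigma_tau_s / (2.0 * math.sqrt(n_channels))


def lateral_error_m(c_m_s, sigma_tau_s, n_channels, focal_m, aperture_m):
    """Lateral localization error: 2 * sqrt(3) * c * sigma_tau * f / (D * sqrt(n))."""
    return (2.0 * math.sqrt(3.0) * c_m_s * sigma_tau_s * focal_m
            / (aperture_m * math.sqrt(n_channels)))


C = 1540.0           # assumed sound speed in soft tissue (m/s)
SIGMA_TAU = 1e-9     # hypothetical timing resolution (s)
N = 128              # receive channels, as in the Methods
FOCAL = 10e-3        # 1 cm depth (m)
APERTURE = 10.24e-3  # 128 elements x 0.08 mm pitch (m)

sigma_z = axial_error_m(C, SIGMA_TAU, N)
sigma_x = lateral_error_m(C, SIGMA_TAU, N, FOCAL, APERTURE)
```

Both errors shrink as 1/√n with the number of channels and scale linearly with the timing resolution. The lateral error additionally grows with the ratio of depth to aperture.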

Animals. All experiments were performed in agreement with the European 
Community Council Directive of 22 September 2010 (2010/63/EU) and the local 
ethics committee (Comité d'éthique en matière d'expérimentation animale no. 59, 
C2EA-59, ‘Paris Centre et Sud’). Accordingly, the number of animals in our study 
was kept to the necessary minimum. Experiments were performed on n = 3 male 
Sprague-Dawley rats (Janvier Labs), weighing 200-225 g at the beginning of the 
experiments. Animals arrived in the laboratory 1 week before the beginning of the 
experiment, and were housed three per cage. They were kept at a constant 
temperature of 22 °C, with a 12 h alternating light/dark cycle (light 7 a.m. to 7 p.m.). Food 
and water were available ad libitum. 

Preparation of the thinned-skull imaging windows. The skull of the rats was 
thinned to 75–100 µm over an area of approximately 0.6 cm × 0.9 cm. The thinned 
window suits the dimension of the ultrasound linear array (0.08 mm per element; 
128 elements = 10.24 mm width). The surgical procedure was performed 1-2 days 
before imaging under anaesthesia using intraperitoneal injections of medetomidine 
(Domitor; 0.3 mg kg⁻¹) and ketamine (Imalgène; 40 mg kg⁻¹). The head 
of the animal was placed in a stereotaxic frame and the skull bone was drilled 
(Foredom) at low speed with a micro drill steel burr (Fine Science Tools, catalogue 
no. 19007-07). To prevent swelling or oedema of the cerebral cortex, the skull 
was frequently cooled with saline and an airstream during the thinning procedure 
as described previously (ref. 27). The thinned window was protected by a small 
(1 cm × 1 cm) plastic cover, and the skin was sutured using 5.0 non-absorbable 
Ethicon thread. Preliminary experiments showed that this method enabled good 
quality ultrasound imaging results within 24 h to 3 days after the preparation, as 
the bone tends to re-grow. 

Preparation of ultrasound contrast-agent microbubbles. To reconstruct the 
vascular microstructure of the rat brain, 1–5 µm perfluorocarbon-filled 
microbubbles (Bracco) were diluted in 0.9% NaCl to yield an initial concentration of 
2 × 10⁸ microbubbles per ml. This concentration corresponds to approximately 
500,000 bubbles per ml of blood per injection, which corresponds to the maximal 
dose injected in clinical practice for superficial contrast-enhanced ultrasound (ref. 28). 
Ultrafast ultrasound localization microscopy was performed in the brain by 
injecting a maximum of 18 bolus injections (corresponding to 2.7 ml of the initial 
suspension) through the catheterized jugular vein. The coronal ultrafast 
acquisitions of the brain were performed every 15 min to guarantee that the injected 
boluses had been cleared out. 

Ultrafast ultrasound imaging sequence. Ultrasound imaging was performed 
using ultrafast Doppler imaging based on compounded plane-wave ultrasound 
transmissions (refs 29, 30). The hardware of the ultrasound scanner was not modified. 
Ultrafast sequences were initiated and processed through software-based sequence 
encoding and data were imported through a PCI Express fast bus for GPU-based 
post-treatment. 
Owing to its high spatiotemporal resolution (1 ms, 100 µm) (ref. 31), this 
technique can measure small haemodynamic changes related to the neurovascular 
coupling. Real-time B-mode imaging was used to control the placement of the 
probe on the field of view. In detail, we developed a plane-wave compounded 
ultrafast imaging sequence (three tilted plane waves, −3°, 0° and 3°, pulse-repetition 
frequency PRF = 1,500 Hz) to perform a scan of the entire brain and have a detailed 
overview of its microvasculature over different coronal imaging planes at a high 
frame rate (500 Hz). Our ultrasonic probe is a custom-built array with 160 elements 
and a central frequency of 20.3 MHz (pitch = 0.08 mm, elevation focus = 10 mm). 
Its 15.4 MHz bandwidth allowed the use of this probe at a frequency of 15 MHz. 
The signal from the 16 elements on either side is discarded because the probe is mounted on a 
fully programmable ultrasound clinical scanner with 256 channels in transmission 
and 128 parallel channels in reception. Data are transferred using a 16x, 6 Gb s⁻¹ 
PCI Express bus and processed using a 12-core 3 GHz Xeon processor and an NVidia 
Quadro K5000 Graphical Processing Unit with a bus at 173 Gb s⁻¹, providing 
2.1 teraflops. Such software-based architecture enables programming of custom 
transmit/receive sequences where the frame rate of each acquisition can reach 
more than 20 kHz. The linear array was coronally fixed at the anterior–posterior 
coordinates of Bregma −0.5 mm and coronally translated for 500 µm with a motor 
to scan and retrieve the vasculature of the whole brain along 2 cm. Each pressure 
transmit pulse consists of 6 cycles (2 µs duration at 15 MHz) at a 1.5 MPa peak 
rarefaction acoustic pressure (mechanical index = 0.4). These pressure amplitudes 
are chosen to reduce the ultrasound-induced disruption of microbubbles and to 
allow the tracking of these agents over several images. 

Boluses of 1501] microbubbles were injected at the beginning of each ultrafast 

acquisition. Once the scan was completed, we fixed the probe above the Bregma 
—1.0mm to continuously insonify for 150s the rat cortex (3.5 mm depth). Ten 
minutes of acquisition were required per each coronal plane of the whole-brain 
scan (11.6mm depth). In this latter case, we injected two 15011 boluses of contrast 
agents (at the beginning of the ultrafast acquisition and in the middle, 5 min) to 
avoid a drop in the microbubble concentration due to the dynamic of the boluses. 
The backscattered echoes were recorded, beamformed with A-line spacing and 
coherently added to produce an echographic image at each transmission. Successive 
raw images corresponding to three different transmission angles at 1,500 Hz PRF 
are then coherently added to produce one higher-contrast ultrasonic image for 
each set of tilted angles at a 500 Hz frame rate. 
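
The relation between pulse repetition frequency, number of tilted angles and compounded frame rate can be sketched as below. This is plain arithmetic under the assumption that one compounded image is formed from each complete set of angles; the function names are hypothetical.

```python
def compounded_frame_rate(prf_hz: float, n_angles: int) -> float:
    """Frame rate after coherently summing one raw image per tilted angle."""
    if n_angles < 1:
        raise ValueError("need at least one transmission angle")
    return prf_hz / n_angles

def frames_in(duration_s: float, frame_rate_hz: float) -> int:
    """Number of compounded frames acquired over a given duration."""
    return int(duration_s * frame_rate_hz)

rate = compounded_frame_rate(1500.0, 3)
print(rate)                    # 500.0
print(frames_in(150.0, rate))  # 75000
```

For the 150 s cortical acquisition this gives 75,000 compounded frames, of the same order as the 74,800 frames reported for the bubble count.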
Data treatment for bubble localization. High-pass spatiotemporal filtering was 
implemented on the stack of the ultrafast images to discriminate the high temporal 
components, belonging to the blood signal, from the slow-moving tissue. Next, 
the stack of filtered ultrafast acquisition was rescaled via interpolation, yielding 
super-resolved output images with a pixel size of 10 µm × 8 µm. Since the bubbles
are much smaller than the wavelength (1-3 µm versus 100 µm) and can be individually
separated in space and time, they appear as the PSF of the ultrasound system.
This PSF is well behaved with respect to the theory of acoustic diffraction because
human and animal soft tissues can be considered homogeneous for acoustic properties
at first-order approximation.

Thereafter, we computed a Gaussian low-pass spatial filter and extracted a 
two-dimensional PSF for deconvolution of the rescaled ultrafast acquisitions. Hence, 
each individual bubble was localized, across all frames in the axial position and in 
depth, with a Gaussian two-dimensional profile whose summit represents the cen- 
troid of each separable source (Extended Data Fig. 1). Only 50% of the maximum 
of the full-width at half-maximum was kept to reconstruct the density maps of the 
bubbles; such thresholding helped cancel unwanted noisy signals. Additionally, to 
avoid any artefact corresponding to independent neighbouring bubble events, only 
bubbles that could be followed for at least 2 ms were included. Finally, the bubbles
were counted and grouped according to their proximity. Almost 1.2 million bubbles
were counted in the rat cortex within 74,800 frames. Supplementary Video 1 shows 
the reconstruction of each vessel through the passage of individual microbubbles. 
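
The idea of sub-pixel localization of an isolated source can be illustrated with a much simpler stand-in than the PSF deconvolution and Gaussian fitting described above: an intensity-weighted centroid around the brightest pixel. This is a substituted technique chosen for brevity, not the authors' method, and `localize_peak` is a hypothetical helper.

```python
def localize_peak(image, half_window=1):
    """Sub-pixel (row, col) of the brightest blob via an intensity-weighted centroid.

    `image` is a list of equal-length rows of non-negative numbers.
    """
    rows, cols = len(image), len(image[0])
    # Coarse position: brightest pixel.
    r0, c0 = max(((r, c) for r in range(rows) for c in range(cols)),
                 key=lambda rc: image[rc[0]][rc[1]])
    total = sum_r = sum_c = 0.0
    # Refine with a weighted mean over a small window around the maximum.
    for r in range(max(0, r0 - half_window), min(rows, r0 + half_window + 1)):
        for c in range(max(0, c0 - half_window), min(cols, c0 + half_window + 1)):
            w = image[r][c]
            total += w
            sum_r += w * r
            sum_c += w * c
    if total == 0:
        raise ValueError("image contains no signal")
    return sum_r / total, sum_c / total

# Hypothetical 5 x 5 frame with a symmetric blob centred on pixel (2, 2).
frame = [[0, 0, 0, 0, 0],
         [0, 1, 2, 1, 0],
         [0, 2, 4, 2, 0],
         [0, 1, 2, 1, 0],
         [0, 0, 0, 0, 0]]
print(localize_peak(frame))  # (2.0, 2.0)
```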

A displacement vector was drawn between these positions, enabling the evaluation
of the instantaneous in-plane velocities of the bubbles, computed as the
displacement from one frame to the next frame divided by the time interval.
Only tracks composed of more than 5 frames (10 ms) were considered when evaluating
the velocities. Coloured velocity maps were constructed using the bubble
paths associated with their in-plane velocities (Fig. 2d). More specifically, blue
corresponds to in-plane velocities towards the top and red to in-plane
velocities towards the bottom. Taken separately but treated equally, the velocity
maps were used to retrieve the velocity profiles of each downstream
and upstream micro-vessel. In Fig. 2e, we selected two representative vessels,
(4) and (5), whose velocities were oriented towards the bottom and towards
the top, respectively. We evaluated the number of bubbles in a fixed-resolution
cell (Δxz), across the sections of the two chosen vessels, and extracted
the fastest 50% of the bubbles. Then, we measured the mean ± standard deviation
of each thresholded in-plane velocity vector and performed an unpaired
Student's t-test. When Δxz was chosen between 8 µm and 12 µm, the quantification
of the velocity distribution for each resolution cell gave a result that
was statistically different from the adjacent one (P < 0.05). Finally, ULM was
performed to reconstruct the vascular network and quantify the velocity maps
in the whole brain. In Extended Data Fig. 2, we show how the microvasculature
of the brain was retrieved with high resolution in depth (11.6 mm) along different
coronal imaging planes (from Bregma −0.5 mm to Bregma −4.5 mm).
Each of these ultrasound acquisitions was divided into three panels of 4 mm
depth to properly filter out the thinned skull bone. Supplementary Video 2
shows the various coronal slices taken during the experiments. It should be
noted that the same filter was applied to reconstruct the vasculature of the
cortex in Figs 2a and 3a, and Extended Data Fig. 2a-i. The in-plane velocity
maps in Extended Data Fig. 3 were obtained with the same data treatment as
Figs 2d and 3b. They enable the quantification of velocity distributions in depth
in the whole brain, corresponding to the coronal imaging planes in Extended
Data Fig. 3.
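
The in-plane velocity estimate described above (frame-to-frame displacement divided by the time interval, restricted to sufficiently long tracks) can be sketched as follows. The 500 Hz frame rate comes from the text; the track data, the function names and the reading of "more than 5 frames" as more than five positions are assumptions made for illustration.

```python
import math

FRAME_RATE_HZ = 500.0  # compounded frame rate stated in the text
MIN_FRAMES = 5         # assumption: tracks must contain more than 5 positions

def track_velocities(track_mm, frame_rate_hz=FRAME_RATE_HZ):
    """Per-step (vx, vz) in mm/s for a list of (x, z) positions in mm."""
    dt = 1.0 / frame_rate_hz
    return [((x1 - x0) / dt, (z1 - z0) / dt)
            for (x0, z0), (x1, z1) in zip(track_mm, track_mm[1:])]

def mean_speed(track_mm, frame_rate_hz=FRAME_RATE_HZ):
    """Mean in-plane speed in mm/s, or None if the track is too short."""
    if len(track_mm) <= MIN_FRAMES:
        return None
    steps = track_velocities(track_mm, frame_rate_hz)
    return sum(math.hypot(vx, vz) for vx, vz in steps) / len(steps)

# Hypothetical bubble moving 10 µm per frame along z (depth).
track = [(1.0, 0.50 + 0.01 * i) for i in range(7)]
print(mean_speed(track))  # close to 5 mm/s
```

A displacement of 10 µm per 2 ms frame interval corresponds to 5 mm/s, which falls inside the ±14 mm/s colour scale used for the velocity maps.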


26. Walker, W. F. & Trahey, G. E. A fundamental limit on delay estimation using 
partially correlated speckle signals. IEEE Trans. Ultrason. Ferroelectr. Freq.
Control 42, 301-308 (1995). 

27. Osmanski, B.-F., Pezet, S., Ricobaraza, A., Lenkei, Z. & Tanter, M. Functional 
ultrasound imaging of intrinsic connectivity in the living rat brain with high 
spatiotemporal resolution. Nature Commun. 5, 5023 (2014). 

28. Dietrich, C. F. et al. An EFSUMB introduction into Dynamic Contrast-Enhanced
Ultrasound (DCE-US) for quantification of tumor perfusion. Ultraschall Med. 33, 
344-351 (2012). 

29. Bercoff, J. et al. Ultrafast compound Doppler imaging: providing full blood flow 
characterization. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 58, 134-147 
(2011). 

30. Errico, C., Osmanski, B.-F., Pezet, S., Couture, O., Lenkei, Z. & Tanter, M. 
Transcranial functional ultrasound imaging of the brain using microbubble- 
enhanced ultrasensitive Doppler. NeuroImage 124, 752-761 (2015).

31. Macé, E. et al. Functional ultrasound imaging of the brain. Nature Methods 8, 
662-664 (2011). 

32. Duck, F. A. Physical Properties of Tissues: A Comprehensive Reference Book 
(Academic, 2013). 




[Figure: horizontal axis, lateral position (mm).]

Extended Data Figure 1 | Schema of the temporal and spatial localization of unique sources. a, Stack of B-mode images. The region
of interest corresponds to a region of 2 mm × 1.1 mm within the cortex.
b, Spatiotemporal filtering of the B-mode images shows the presence of
decorrelating microbubbles in each frame (1-4). c, The four representative
frames are separated by 44 ms (1-4). d, Computed two-dimensional PSF
of the rescaled and filtered ultrafast acquisitions. These echoes are then
interpolated and the Cartesian coordinates of their centres are obtained
(1-4). The summit of each two-dimensional Gaussian profile identifies the
centroid of each separable source.




Extended Data Figure 2 | uULM coronal scan (anterior-posterior) of
the entire rat brain through a thinned skull window. a-i, The
ultrasound probe was driven by a micro-step motor to perform uULM
on different imaging planes separated by 500 µm. We reconstructed the
vascularization of the rat brain at the following coordinates: Bregma
−0.5 mm (a), −1 mm (b), −1.5 mm (c), −2 mm (d), −2.5 mm (e), −3 mm
(f), −3.5 mm (g), −4 mm (h), −4.5 mm (i).




[Figure colour scale: velocity (mm/s), from −14 to 14.]

Extended Data Figure 3 | Anterior-posterior scan of in-plane velocity maps of the rat forebrain through a thinned skull window. a-i, Velocity maps
for the different coronal planes presented in Extended Data Fig. 2.




LETTER 


doi:10.1038/nature15734 


Extra adsorption and adsorbate superlattice 
formation in metal-organic frameworks 


Hae Sung Cho1, Hexiang Deng2,3*, Keiichi Miyasaka1*, Zhiyue Dong, Minhyung Cho1, Alexander V. Neimark4, Jeung Ku Kang,
Omar M. Yaghi1,5,6 & Osamu Terasaki1,7


Metal-organic frameworks (MOFs) have a high internal surface 
area and widely tunable composition’, which make them useful 
for applications involving adsorption, such as hydrogen, methane 
or carbon dioxide storage*°. The selectivity and uptake capacity 
of the adsorption process are determined by interactions involving 
the adsorbates and their porous host materials. But, although 
the interactions of adsorbate molecules with the internal MOF 
surface!°-!” and also amongst themselves within individual 
pores!®-”” have been extensively studied, adsorbate-adsorbate 
interactions across pore walls have not been explored. Here we 
show that local strain in the MOF, induced by pore filling, can give 
rise to collective and long-range adsorbate-adsorbate interactions 
and the formation of adsorbate superlattices that extend beyond 
an original MOF unit cell. Specifically, we use in situ small-angle 
X-ray scattering to track and map the distribution and ordering 
of adsorbate molecules in five members of the mesoporous MOF- 
74 series along entire adsorption-desorption isotherms. We find 
in all cases that the capillary condensation that fills the pores 
gives rise to the formation of ‘extra adsorption domains’—that is, 
domains spanning several neighbouring pores, which have a higher 
adsorbate density than non-domain pores. In the case of one MOF, 
IRMOF-74-V-hex, these domains form a superlattice structure 
that is difficult to reconcile with the prevailing view of pore- 
filling as a stochastic process. The visualization of the adsorption 
process provided by our data, with clear evidence for initial 
adsorbate aggregation in distinct domains and ordering before an 
even distribution is finally reached, should help to improve our 
understanding of this process and may thereby improve our ability 
to exploit it practically. 

Figure 1 shows the three distinct types of interaction in which 
adsorbates in MOFs can engage: adsorbate molecules can interact 
with the material’s internal surface (regime A); adsorbates can inter- 
act among themselves within the confines of a pore (regime B); and 
adsorbates can interact with each other across pores mediated by the 
material framework (regime C). Studying the collective adsorbate 
behaviour in regimes B and C requires porous MOF crystals, with 
pores that are large enough to enable the organization and behaviour 
of confined adsorbates to be observed, and with pore walls that are 
atomically thin and well-defined so as to allow observation of any 
local perturbations resulting from adsorption. In such systems, we can 
then use in situ small-angle X-ray scattering (SAXS) to detect long- 
range ordering of adsorbates in multiple pores at precisely controlled 
temperatures and pressures. 

We chose the five mesoporous MOFs with isoreticular structure 
(IRMOF-74-III, IRMOF-74-IV, IRMOF-74-V, IRMOF-74-V-hex 
and IRMOF-74-VII) that are based on the crystalline IRMOF-74 
structure®?-*°, The robustness of the IRMOF-74 honeycomb-like 
structure (in projection) is imparted by one-dimensional, rod-shaped 
magnesium oxide units that run along the pore direction and are held 
together by organic linkers (Fig. 2a). This rigid oxide unit allows for 
structural refinements in two dimensions, by keeping constant the 
structure along the c axis of the original MOF structure® (Fig. 2b). 
Thus, we apply the projected symmetry of the two-dimensional space 
groups (plane groups) p3 or p6 for the unit cell (Fig. 2b, green paral- 
lelogram). We therefore need only two variables, h and k, to specify 
the reflections with the h and k indices for the refinement. This allows 
us to focus on the adsorption region, and stops us from having to deal 
unnecessarily with the more complicated original symmetry R3 in 
IRMOF-74-IV, IRMOF-74-V and IRMOF-74-VII, or R3 in IRMOF-74-III
and IRMOF-74-V-hex (Fig. 2b, red parallelogram).

All of these MOFs exhibit open porosity and have mesopores with 
sizes of 22 Å, 28 Å, 35 Å and 49 Å (for IRMOF-74-III, IRMOF-74-IV,
IRMOF-74-V and IRMOF-74-VII, respectively). IRMOF-74-V-hex,
having a pore size of 34 Å, was constructed with a linker functionalized




Figure 1 | Three adsorbate-interaction regimes in mesoporous MOFs.
In regime A, adsorbed molecules interact (green arrows) with pore walls.
In regime B, adsorbates interact amongst each other (blue arrows) within
a pore. These two types of interaction and the corresponding regimes
have been well studied. Regime C, however, has not been explored; here,
adsorbates interact with each other (red arrows) across pore walls, in a way
that is mediated by the framework. Light blue, molecules adsorbed onto
the internal pore surface; yellow, molecules in the centre of the pores.


1Graduate School of Energy, Environment, Water and Sustainability, WCU/BK21Plus, KAIST, Daejeon 305-701, South Korea. 2Key Laboratory of Biomedical Polymers-Ministry of Education, College
of Chemistry and Molecular Sciences, Wuhan University, Luojiashan, Wuhan 430072, China. 3The Institute for Advanced Studies, Wuhan University, Wuhan 430072, China. 4Department of
Chemical and Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, New Jersey 08854, USA. 5Department of Chemistry, University of California, Materials Sciences Division at
Lawrence Berkeley National Laboratory, and Kavli Energy NanoSciences Institute, Berkeley, California 94720, USA. 6King Fahd University of Petroleum and Minerals, Dhahran 34464, Saudi Arabia.
7Department of Materials and Environmental Chemistry, Berzelii Centre EXSELENT on Porous Materials, Stockholm University, Stockholm SE-10691, Sweden.


*These authors contributed equally to this work. 


26 NOVEMBER 2015 | VOL 527 | NATURE | 503 




Figure 2 | Structure of the IRMOF-74 series in three and two
dimensions. a, The honeycomb-like structure (in projection) of MOFs
of the IRMOF-74 series is imparted by one-dimensional, rod-shaped
magnesium oxide secondary building units, held together by organic
linkers (III, IV, V, V-hex or VII). b, Green dashes show the two-dimensional
unit cell, corresponding to plane groups p3 and p6, that we


with a hexyl chain (V-hex). The atomically thin walls of the pores in 
these MOFs and their large pore sizes are key factors in their suitability 
for examining the collective behaviour of the adsorbates within and 
across the pores (regimes B and C, Fig. 1). 

In contrast to other in situ adsorption studies, performed using a 
synchrotron beamline!®!”%, we used a laboratory-designed SAXS 
set-up operating in transmission mode with a rotating anode X-ray 
source, a graded confocal optic, and a Kratky block system to create a 
monochromatic beam focusing on the detector. Incorporation of an 
adsorption apparatus in the SAXS system enables measurement of 
both X-ray-diffraction profiles and gas-adsorption isotherms from 
the same sample at a precisely controlled temperature and adsorbate 
pressure (see Supplementary Information, section 1). We illustrate 
SAXS-based adsorption tracking for argon uptake by IRMOF-74-V- 
hex, for which the adsorption process can be divided into stages 1 to 5, 
taking place within the pressure ranges 0 to 0.5 kPa, 0.5 to 27 kPa, 27 to 
33 kPa, 33 to 50 kPa and 50 to 100 kPa, respectively (Fig. 3a). Although 
the shape of the isotherm is similar to that of a type IV isotherm (as 
classified by the International Union of Pure and Applied Chemistry), 
typical for mesoporous materials, the distinct slopes seen in stages 
4 and 5 point to two major differences. To understand the origin of these
slopes, we measured SAXS profiles along the entire adsorption curve,
among which 11 different gas pressures (Fig. 3a) were selected to rep- 
resent the different stages of the argon adsorption process (Fig. 3b). 
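
The five-stage division of the argon isotherm can be written as a small lookup, sketched below. The boundaries are the pressure ranges listed above; assigning a pressure that falls exactly on a boundary to the lower stage is an assumption, and `adsorption_stage` is a hypothetical name.

```python
import bisect

# Upper pressure bound (kPa) of stages 1 to 5, as listed in the text.
STAGE_UPPER_KPA = [0.5, 27.0, 33.0, 50.0, 100.0]

def adsorption_stage(pressure_kpa: float) -> int:
    """Map an argon pressure (kPa) to adsorption stage 1-5.

    Assumption: a pressure exactly on a boundary belongs to the lower stage.
    """
    if not 0.0 <= pressure_kpa <= STAGE_UPPER_KPA[-1]:
        raise ValueError("pressure outside the 0-100 kPa range of the isotherm")
    return bisect.bisect_left(STAGE_UPPER_KPA, pressure_kpa) + 1

for p in (0.05, 15, 30, 40, 80):
    print(p, adsorption_stage(p))  # stages 1, 2, 3, 4 and 5 in turn
```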

The electron distribution of argon atoms introduced into the pores 
was obtained from 54 (7 independent) reflections, using difference 
Fourier analysis of the measured intensity profile of the argon-filled 
MOF and the calculated intensity profile of the corresponding acti- 
vated MOF without argon (Fig. 3d; see also Supplementary Figs 4 
and 5 and Tables 1-4). (Note that, although we cannot determine 
the argon distribution with atomic resolution owing to limitations 
imposed by the maximum q range (the largest angle that can be
detected; q = 4π sin θ/λ) of the SAXS instrument, the resolution is sufficient
to map the electron-distribution trend within the large pores;




applied for structural refinement. Red dashes show the √3 × √3 unit cell,
which corresponds to the projection of the original space groups R3 and
R3, used to reveal adsorbate distribution. c, The 2 × 2 superlattice cell
(purple dashes); the pores in violet contain a larger number of
adsorbates, compared with the surrounding pores in blue.


an exception is the centre point, which might be slightly affected 
by the termination effect in the Fourier synthesis.) The adsorbate 
electron-distribution maps (Fig. 3d) reveal that, as expected and in 
agreement with previous findings’®, argon interacts strongly with the 
open metal sites of the magnesium oxide units in stages 1 and 2. The 
electron-density map of argon at 27 kPa (Fig. 3d) and correspond- 
ing electron density distribution profile (Fig. 3c) show two to three 
cylindrical layers of argon atoms adsorbed onto the walls (regime A in 
Fig. 1). This is followed by argon condensation in the pores, which 
commences at stage 3 and is accompanied by a steep increase in gas 
uptake (Fig. 3a, d). At roughly midway through stage 3, the corresponding
hk = 10 reflection intensity decreases sharply (Fig. 3b;
Supplementary Fig. 10), while a new broad peak at q = 0.10 Å⁻¹
emerges (marked by grey dashes in Fig. 4a), which is evidence of collective
adsorbate-adsorbate interactions. Although the pores are not yet
completely filled, as indicated by the smeared-out electron density in 
the centre region of the pore (Fig. 3d), the emergence of this broad 
peak unambiguously represents an important point (termed the 
aggregation point) in the initiation of formation of extra adsorption 
domains, whereby adsorbate atoms gather in certain pore regions in 
higher numbers than the average. 
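
To relate the scattering vector used throughout to real-space length scales, the sketch below evaluates q = 4π sin θ/λ and the corresponding period d = 2π/q. The X-ray wavelength is not given in this section, so the Cu Kα value is an assumption, and both function names are illustrative.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5418  # assumption: wavelength is not stated in the text

def q_from_two_theta(two_theta_deg: float,
                     wavelength: float = CU_K_ALPHA_ANGSTROM) -> float:
    """Scattering vector magnitude q = 4*pi*sin(theta)/lambda, in 1/Å."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength

def real_space_period(q_inv_angstrom: float) -> float:
    """Characteristic real-space length d = 2*pi/q, in Å."""
    return 2.0 * math.pi / q_inv_angstrom

print(round(real_space_period(0.10), 1))  # 62.8
```

The broad peak at q = 0.10 Å⁻¹ therefore corresponds to a length scale of roughly 63 Å, longer than the 34 Å pore size quoted for IRMOF-74-V-hex.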

The intensity of this broad peak reaches a maximum as stage 3 turns 
to stage 4 (33 kPa), and then decreases gradually to eventually disap- 
pear at the end of stage 4 (Fig. 4a). Furthermore, the density of argon 
in the centre region increases more than it does around the pore walls 
(Fig. 3d, stage 4) during both the appearance and the disappearance of 
the peak. The correspondence between the appearance/disappearance 
of this peak with the characteristics of the pore-filling process indicates 
that this unusual phenomenon originates from the complex collective 
behaviour of argon: argon atoms are not equally distributed through- 
out the available pores during stages 3 and 4 of the condensation 
process, but instead exhibit density fluctuations that result in extra 
adsorption domains with a higher-than-average argon concentration 
spanning several contiguous pores. 




Figure 3 | Mapping of argon distribution in IRMOF-74-V-hex. a, Argon
uptake by IRMOF-74-V-hex at different gas pressures. The isotherm shows
five stages (1 to 5), with distinct slopes. Three points (red, dark green and
light blue) are highlighted for the start, end/start and end of two events
unobserved in type IV isotherms. b, SAXS scattering profiles measured
along the entire adsorption process at 11 different gas pressures, covering
the different stages of argon adsorption. The patterns are overlaid in
linear scale, with colours corresponding to the points in the isotherm.


Further evidence for the formation of these domains comes from 
the adsorption profile slope in stage 4, which differs strongly from that 
in stage 3. The formation of the domains causes unit-cell contraction 
and associated broadening of the diffraction intensity profiles (hk = 10, 
11, 20, 21, 30, 22 and 31) during stage 3, and expansion of the unit cell 
and sharpening of the associated profiles during stage 4 (Fig. 4b-d). 
These effects are correlated with changes in the local strain of the MOF
backbone, which results in the emergence of an additional stage in the
overall adsorption process in IRMOF-74. This is indicated by the full-width
at half-maximum (FWHM) of the SAXS profile peaks and by the unit-cell
parameter, which reach their maximum and minimum values, respectively,
during stage 3 (Fig. 4c, d). In stage 4,
the strain induced by the adsorption heterogeneity starts to smear out 
and the FWHM decreases as more argon atoms enter the pores and 
move towards a more homogenized arrangement, leading to a different 
slope for gas uptake*’. We note that the changes in the FWHM and 
unit-cell parameters of IRMOF-74 during and after mesopore con- 
densation resemble those accompanying gas adsorption in MCM-41 




c, Three-dimensional contour map of the electron-density profile of argon 
at 27 kPa. The mesopores are covered by argon at this point. d, Projected 
argon distribution in two dimensions from the three-dimensional contour 
maps. Each two-dimensional map reveals the argon distribution within the 
MOF structure at a certain pressure. The red lines at 27 kPa indicate the 
argon profile projection in two directions (metal site to metal site, and wall 
to wall). 


(a typical mesoporous silica with relatively thick pore walls’’), but 
the magnitude of the changes in our system is much larger than that 
observed for MCM-41. This indicates that the adsorbates stress the 
IRMOF-74 framework, with its thinner MOF walls, more than they 
stress the more-sturdy MCM-41 framework; this also explains why 
a broad peak at low q range and a unique slope at stage 4 could not 
be observed during and after mesopore condensation for MCM-41 
(Supplementary Fig. 12). 

The fate of the extra adsorption domains in IRMOF-74-V-hex 
can be gleaned from the abrupt appearance of superlattice reflections
in the SAXS patterns (reflections at q = 0.25 Å⁻¹ (marked
by grey dashes) in Fig. 4a, and at q = 0.42 Å⁻¹ in Supplementary
Fig. 40) at the start of stage 4. The intensity of the reflections
decreases as the pressure increases from 33 kPa, and becomes 
zero at 50 kPa (Fig. 4b), accompanied by a decrease in the 
FWHM of all the profile reflections. We infer from these observa- 
tions that extra adsorption domains form, and that the contrast 
between these domains and the surrounding domains increases 




Figure 4 | Extra adsorption domains and argon adsorbate superlattice in
IRMOF-74-V-hex. a, The appearance and disappearance of the broad peak
(at q = 0.093 Å⁻¹) and the superlattice peak (at q = 0.25 Å⁻¹) in the SAXS
patterns during the adsorption process, with intensity magnified by four in
the right-hand image. b, Intensity changes of the 1 1/2 and 3/2 1 superlattice
reflections. c, Tracking of the unit-cell parameter change of IRMOF-74-V-hex
during the adsorption process. d, Tracking of the corresponding FWHM


during stage 3 of the adsorption process (Supplementary Fig. 11); 
moreover, we conclude that once the point of maximum contrast 
(termed the organization point) has been reached, the domains com- 
mence to form the adsorbate superlattice in stage 4. It is the superlat- 
tice formation that relieves local strain—increasingly so as the contrast 
lessens and as more adsorbates fill the pores towards uniform distri- 
bution (Supplementary Fig. 11). 

The precise structure of the superlattice is determined from the
positions of the reflections at q = 0.25 Å⁻¹ and q = 0.42 Å⁻¹, mentioned
above, indexed by hk = 1 1/2 and 3/2 1 and corresponding to an adsorbate
superlattice structure with a 2 × 2 unit cell (Fig. 2c, purple parallelogram,
and Fig. 4e). Although this structure also gives a 1/2 0 reflection
(q = 0.093 Å⁻¹) that overlaps with the broad peak, the position and
intensity of the two observed peaks rule out the possibility of
a √3 × √3 superlattice (Fig. 2b, red parallelogram, and Fig. 4e) that
might form through modulation of the MOF structure. In such a case,
the corresponding ordered reflections would have appeared at
q = 0.29 Å⁻¹; however, these were absent in the SAXS
patterns. Note also that the line-widths of the adsorbate superlattice
reflections are much larger than those of fundamental reflections,
further suggesting that the origin of these extra peaks is not associated
with the framework lattice. Detailed analysis of the FWHM revealed
that the size of the superlattice domains is about 400 Å.
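
The indexing argument can be checked numerically with the standard expression for a two-dimensional hexagonal lattice, q = 4π/(√3 a) · √(h² + hk + k²), where fractional h and k index superlattice reflections on the parent cell. The sketch below uses an assumed parent cell parameter of 39 Å (a round illustrative value, not a refined result) and a hypothetical function name.

```python
import math

def q_hex(h: float, k: float, a_angstrom: float) -> float:
    """|q| of the (h k) reflection of a 2D hexagonal lattice, in 1/Å."""
    prefactor = 4.0 * math.pi / (math.sqrt(3.0) * a_angstrom)
    return prefactor * math.sqrt(h * h + h * k + k * k)

A = 39.0  # assumption: parent cell parameter in Å, chosen for illustration

print(round(q_hex(1, 0, A), 3))    # 0.186, fundamental 10 reflection
print(round(q_hex(0.5, 0, A), 3))  # 0.093, 2 x 2 superlattice 1/2 0
print(round(q_hex(1, 0.5, A), 3))  # 0.246, 2 x 2 superlattice 1 1/2
```

Under this assumption the half-integer indices of a 2 × 2 cell reproduce the reflection near 0.093 Å⁻¹ and the superlattice peak near 0.25 Å⁻¹ quoted above.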

Upon further increases in the argon pressure, the extra adsorption 
domains and superlattice reflections disappear at the end of stage 4 
(Fig. 3a). During the next stage (stage 5), the adsorption isotherm 


of reflections in the SAXS profile. e, Illustration of the extra adsorption
domains (aggregation point, 30 kPa) and superlattice (organization point,
33 kPa) formed as a result of argon being distributed unevenly among
adjacent mesopores. Green, red and purple dashes indicate the original
MOF, √3 × √3 and 2 × 2 unit cells, respectively. The size of the argon
superlattice domain (dark blue dashes at 33 kPa) is about 400 Å.


shows a new slope, and the electron density in the centre region of 
the pores gradually increases (Fig. 3d, stage 5) and leads to a slight 
unit-cell expansion (Fig. 4c) to accommodate more incoming argon 
atoms in a uniform manner among different pores. This changeover 
point in the isotherm (termed the homogenization point) marks the 
initiation of uniform pore expansion: the adsorbate superlattice disap- 
pears and homogenization of the adsorbate density takes place without 
involvement of the long-range adsorbate-adsorbate interactions that
are mediated by local strain in the MOF framework. In terms of the 
amount of argon uptake, stages 4 and 5 account for up to 22% of the 
total uptake in IRMOF-74-V-hex. 

An overview of how different SAXS characteristics document 
the different stages in the overall adsorption process is provided in 
Extended Data Fig. 1. The desorption process of argon in IRMOF- 
74-V-hex—which was also carefully studied (Fig. 3a) and compared 
with the adsorption process in detail (Supplementary Figs 7-9, 
Supplementary Tables 5 and 6, and Supplementary Video)—involves 
the same stages as those seen during adsorption. 

The broad peak that is seen at low q values was observed in all 
IRMOFs for all three adsorbates studied (argon, nitrogen and car- 
bon dioxide) (Supplementary Fig. 14). During stage 3, this peak 
was observed in the SAXS intensity profiles at q = 0.12 Å⁻¹ and
q = 0.094 Å⁻¹ for IRMOF-74-IV and IRMOF-74-V, respectively.
From the distance distribution function derived from the SAXS
data in the q range of 0.016 Å⁻¹ to 0.18 Å⁻¹ for IRMOF-74-IV, and
0.016 Å⁻¹ to 0.16 Å⁻¹ for IRMOF-74-V and IRMOF-74-V-hex, the
maximum size of individual extra adsorption domains is calculated
to be approximately 60 Å for IRMOF-74-IV, and 70 Å for IRMOF-74-V
and IRMOF-74-V-hex. Although extra adsorption domains were
seen in all IRMOF-74 compounds during the pore-filling process, the
intensity of the additional reflections that are attributed to superlattice
formation was negligible in the case of IRMOF-74-IV and IRMOF-74-V.
The hexyl chains of IRMOF-74-V-hex thus seem to be important
in superlattice formation, although pore size will also be relevant (as
superlattices were not detected in IRMOF-74-VII, where hexyl chains
are present but within the confines of larger pores).

The changes in the SAXS profiles seen during adsorption and deso- 
rption of all three adsorbates follow similar patterns (Supplementary 
Information, sections 2-5). Intriguingly, we also find that each of 
the three adsorbates desorbs at a different pressure, and that this 
adsorbate-specific desorption pressure is, to a first approximation, 
independent of the exact nature and pore size of the IRMOFs tested 
(Supplementary Tables 5, 9 and 11). This observation is another clear 
piece of evidence that adsorbate-adsorbate interactions within and 
across adjacent pores play a major role in gas uptake and release, both 
at the outset of the desorption process and in the formation of extra 
adsorption domains. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 22 June 2014; accepted 8 September 2015. 
Published online 9 November 2015; corrected online 25 November 2015 
(see full-text HTML version for details). 


1. Kitagawa, S., Kitaura, R. & Noro, S. Functional porous coordination polymers. 
Angew. Chem. Int. Edn 43, 2334-2375 (2004). 

2. Furukawa, H., Cordova, K. E., O’Keeffe, M. & Yaghi, O. M. The chemistry 
and applications of metal-organic frameworks. Science 341, 1230444 
(2013). 

3. Rosi, N. L. et al. Hydrogen storage in microporous metal-organic frameworks.
Science 300, 1127-1129 (2003). 

4. Dinca, M. et al. Hydrogen storage in a microporous metal-organic framework 
with exposed Mn2+ coordination sites. J. Am. Chem. Soc. 128, 16876-16883
(2006). 

5. Farha, O. K. et al. De novo synthesis of a metal-organic framework material 
featuring ultrahigh surface area and gas storage capacities. Nature Chem. 2, 
944-948 (2010). 

6. Holst, J. R. & Cooper, A. I. Ultrahigh surface area in porous solids. Adv. Mater.
22, 5212-5216 (2010). 

7. Makal, T.A., Li, J., Lu, W. & Zhou, H. Methane storage in advanced porous 
materials. Chem. Soc. Rev. 41, 7761-7779 (2012). 

8. Deng, H. et al. Large-pore apertures in a series of metal-organic frameworks. 
Science 336, 1018-1023 (2012). 

9. Nugent, P. et al. Porous materials with optimal adsorption thermodynamics 
and kinetics for CO2 separation. Nature 495, 80-84 (2013). 

10. Rowsell, J. L. C., Spenser, E. C., Eckert, J., Howard, J. A. K. & Yaghi, O. M. Gas 
adsorption sites in a large-pore metal-organic framework. Science 309, 
1350-1354 (2005). 

11. Vaidhyanathan, R. et al. Direct observation and quantification of CO2 binding 
within an amine-functionalized nanoporous solid. Science 330, 650-653 
(2010). 

12. Yang, S. et al. Selectivity and direct visualization of carbon dioxide and 
sulfur dioxide in a decorated porous host. Nature Chem. 4, 887-894 
(2012). 


13. Serre, C. et al. Role of solvent-host interactions that lead to very large swelling 
of hybrid frameworks. Science 315, 1828-1831 (2007). 

14. Rabone, J. et al. An adaptable peptide-based porous material. Science 329, 
1053-1057 (2010). 

15. Scherb, C., Koehn, R. & Bein, T. Sorption behavior of an oriented surface-grown 
MOF-film studied by in situ X-ray diffraction. J. Mater. Chem. 20, 3046-3051 
(2010). 

16. Bureekaew, S. et al. Control of interpenetration for tuning structural flexibility 
impacts on sorption properties. Angew. Chem. Int. Edn 49, 7660-7664 (2010). 

17. Sato, H. et al. Self-accelerating CO sorption in a soft nanoporous crystal. 
Science 343, 167-170 (2014). 

18. Inagaki, S., Guan, S., Ohsuna, T. & Terasaki, O. An ordered mesoporous 
organosilica hybrid material with a crystal-like wall structure. Nature 416, 
304-307 (2002). 

19. Zhao, D. et al. Triblock copolymer syntheses of mesoporous silica with periodic 
50 to 300 angstrom pores. Science 279, 548-552 (1998). 

20. Joo, S. H. et al. Ordered nanoporous arrays of carbon supporting high 
dispersions of platinum nanoparticles. Nature 412, 169-172 (2001). 

21. Muroyama, N. et al. Argon adsorption on MCM-41 mesoporous crystal studied 
by in situ synchrotron powder X-ray diffraction. J. Phys. Chem. C 112, 
10803-10813 (2008). 

22. Miyasaka, K., Neimark, A. V. & Terasaki, O. Density functional theory of in-situ 
synchrotron powder X-ray diffraction on mesoporous crystals: argon 
adsorption on MCM-41. J. Phys. Chem. C 113, 791-794 (2009). 

23. Rosi, N. L. et al. Rod packings and metal-organic frameworks constructed from 
rod-shaped secondary building units. J. Am. Chem. Soc. 127, 1504-1518 (2005). 

24. Dietzel, P. D. C., Blom, R. & Fjellvåg, H. Base-induced formation of two 
magnesium metal-organic framework compounds with a bifunctional 
tetratopic ligand. Eur. J. Inorg. Chem. 2008, 3624-3632 (2008). 

25. Lin, L. C. et al. Understanding CO2 dynamics in metal-organic frameworks with 
open metal sites. Angew. Chem. Int. Edn 52, 4410-4413 (2013). 

26. Bon, V. et al. In situ monitoring of structural changes during the adsorption on 
flexible porous coordination polymers by X-ray powder diffraction: 
instrumentation and experimental results. Microporous Mesoporous Mater. 188, 
190-195 (2014). 

27. Gor, G. Y. et al. Adsorption of n-pentane on mesoporous silica and adsorbent 
deformation. Langmuir 29, 8601-8608 (2013). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements The authors acknowledge K. Ito, K. Sasaki, M. Kuribayashi 
and N. Muroyama (Rigaku America and Japan) and K. Nakai (Japan Bel) for 
technical support; N. Fujita and T. Nishimatsu (Tohoku University, Japan), 
H. Furukawa and Y. Zhang (University of California at Berkeley, USA) for their 
input; and A. Sawada (Kyoto University, Japan) for advice in designing the gas 
cell. Financial support was provided by WCU/BK21+ (to H.S.C., K.M., J.K.K., 
O.M.Y. and O.T.); HIMC of Global Frontier Project (2013M3A6B1078884) 
funded by the Ministry of Science, ICT and Future Planning and Korea Center 
for Artificial Photosynthesis (to J.K.K.); Berzelii Centre EXSELENT on Porous 
Materials (to O.T.); and BASF (Ludwigshafen, Germany) (to O.M.Y.). H.D. and 
Z.D. were supported by the 1000 Talent Plan of China, National Natural Science 
Foundation of China (21471118) and National Key Basic Research Program 
of China (2014CB239203). A.V.N. acknowledges support from the NSF ERC 
‘Structured Organic Particulate Systems’. 


Author Contributions O.T., K.M. and J.K.K. designed and set up the experimental 
system. O.T. and O.M.Y. designed and led the project. H.S.C., K.M. and H.D. 
performed the SAXS experiments. H.D., Z.D. and M.C. prepared samples. 
A.V.N. contributed discussion of the gas adsorption-desorption process. 
H.D., H.S.C., J.K.K., O.M.Y. and O.T. prepared the first version of the manuscript 
and all authors contributed to the final version. 


Additional Information Reprints and permissions information is available 
at www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 
O.M.Y. (yaghi@berkeley.edu) or O.T. (terasaki@kaist.ac.kr). 


26 NOVEMBER 2015 | VOL 527 | NATURE | 507 


© 2015 Macmillan Publishers Limited. All rights reserved 


METHODS 


Synthesis of IRMOF-74 series. Organic linkers were synthesized as reported 
previously⁸. IRMOF-74 samples were synthesized by combining organic linkers 
with Mg(NO3)2 in a solution of dimethylformamide, ethanol and water, and then 
heating the mixture in an oven at 120 °C for 24 hours⁸. Needle-shaped crystals 
clustered in spherical forms were obtained. These IRMOF-74 samples were 
evacuated after solvent exchange with methanol nine times over three consecutive 
days to remove guest molecules. 

In situ gas adsorption SAXS measurement. The in situ SAXS measurements for 
Ar, CO2 and N2 adsorption by IRMOF-74s (III, IV, V, VII and V-hex) were 
performed using a SAXS instrument (BioSAXS-1000; Rigaku, USA) equipped with 
a rotating-anode X-ray source (FR-E+ Super Bright; Rigaku, Japan) and a gas 
adsorption instrument (BELSORP-max) together with a specially designed cell on 
a cryostat (Bel, Japan). We incorporated a sample cell inside the SAXS instrument, 
with a small chamber connected to the gas adsorption instrument placed outside. 
In addition, we used a large-area detector combined with copper Kα radiation from 
a rotating-anode X-ray source to provide precise measurement of both the 
intensity and the position of the diffraction peaks within a wide q (= 4π sin θ/λ) 
range, from 0.01 to 0.71 Å⁻¹. Measurements were carried out with copper Kα 
radiation in transmission mode with Confocal Max Flux Mirror, a two-dimensional 
Kratky block and a Pilatus-type detector in the SAXS instrument. The powder 
samples were mounted in two places next to each other under the same adsorbate 
environment: one was within the hollow part of the stainless steel 
rectangular plate covered by polyether ether ketone (PEEK) polymer films, in the 
X-ray path for diffraction; the other was for improving accuracy in measuring gas 
adsorption/desorption isotherms. The assembled samples were connected 
thermally to the temperature-controlling cryostat system, where the temperature is 
controlled to within ±0.01 K, and to the gas adsorption instrument. The position of 
the sample cell was adjusted to the X-ray pathway within the chamber of the SAXS 
instrument at low temperatures before starting to take measurements. 
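The q range quoted above can be related to scattering angle and real-space length scale with a short calculation. The sketch below is illustrative only: the Cu Kα wavelength of 1.5418 Å is an assumed standard value that is not quoted in the text, and the function names are hypothetical.

```python
import math

# Assumed mean Cu K-alpha wavelength in angstroms (not quoted in the text).
CU_K_ALPHA_ANGSTROM = 1.5418


def q_from_two_theta(two_theta_deg, wavelength=CU_K_ALPHA_ANGSTROM):
    """Return q = 4*pi*sin(theta)/lambda in inverse angstroms."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def two_theta_from_q(q, wavelength=CU_K_ALPHA_ANGSTROM):
    """Invert the relation: scattering angle 2*theta in degrees for a given q."""
    return 2.0 * math.degrees(math.asin(q * wavelength / (4.0 * math.pi)))


def d_spacing(q):
    """Real-space periodicity d = 2*pi/q in angstroms."""
    return 2.0 * math.pi / q


# Limits of the q range stated in the Methods.
for q in (0.01, 0.71):
    print(f"q = {q:.2f} 1/A  ->  2theta = {two_theta_from_q(q):.3f} deg, "
          f"d = {d_spacing(q):.1f} A")
```

Under this assumed wavelength, the stated range corresponds to scattering angles from roughly 0.14° to 10°, that is, periodicities from several hundred ångströms down to about 9 Å.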

A known weight (~0.03 g) of the IRMOF-74-III, IRMOF-74-IV, IRMOF-74-V, 
IRMOF-74-VII and IRMOF-74-V-hex samples was mounted in the sample cell 
and activated at 373 K for 6 hours under vacuum (~0.01 Pa) to remove the guest 
molecules before a series of measurements was taken. Activation of the 
IRMOF-74s was confirmed by comparing the argon isotherm of these MOFs at 87 K 
and the CO2 isotherm at 273 K with those of the activated sample measured on the 
traditional adsorption instrument (Supplementary Fig. 3). Gases (Ar, N2 or CO2) 
were introduced into the sample cell at the measurement temperature; for each 
measurement, the gas pressure was changed and then maintained for 5 min after 
the system reached equilibrium (we judged the system to have reached equilibrium 
if the pressure fluctuation was less than 1 Pa for 5 min, which took roughly 
30 min). The SAXS instrument was synchronized to the gas adsorption measurement 
and each SAXS pattern was collected at an equilibrium point of the sorption 
isotherms (the exposure time for each measurement was 30 minutes). There was no 
pressure change after the SAXS measurement, confirming that the sample with 
adsorbates in the sample cell was at equilibrium. 
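The equilibrium criterion described above (pressure fluctuation below 1 Pa for 5 min) can be expressed as a simple check on a logged pressure trace. This is a minimal sketch under stated assumptions: "fluctuation" is interpreted here as the peak-to-peak range within the window, and the function name and data layout are hypothetical rather than taken from the instrument software.

```python
def is_equilibrated(times_s, pressures_pa, window_s=300.0, tolerance_pa=1.0):
    """Return True if pressure varied by less than tolerance_pa over the
    most recent window_s seconds of the log.

    times_s and pressures_pa are parallel lists, with times in ascending order.
    """
    if not times_s or len(times_s) != len(pressures_pa):
        raise ValueError("times and pressures must be non-empty and equal length")
    t_end = times_s[-1]
    if t_end - times_s[0] < window_s:
        return False  # not enough history to cover a full window
    recent = [p for t, p in zip(times_s, pressures_pa) if t >= t_end - window_s]
    return max(recent) - min(recent) < tolerance_pa


# Hypothetical log sampled once per minute for 10 minutes.
times = [60.0 * i for i in range(11)]
settling = [1000.0, 990.0, 985.0, 983.0, 982.0, 981.5,
            981.4, 981.3, 981.3, 981.2, 981.2]
print(is_equilibrated(times, settling))
```

A stricter definition (for example, a limit on the standard deviation or on the drift rate) would fit into the same structure by changing only the final comparison.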

Before the actual adsorption/SAXS measurement started, gas adsorption 
without SAXS measurement was performed to confirm the adsorption curve 
and to set up the SAXS measurement points. We then collected SAXS scattering 
profiles at each of the 24 equilibrium points in the adsorption process, 
including the initial point (in vacuum). Another 21 profiles were collected for the 
desorption process. No transformation in the structure of the backbone of 
IRMOF-74 occurred throughout the whole gas adsorption process, as 
confirmed by the absence of obvious changes in peak positions in these SAXS 
patterns. Moreover, the samples did not show structural differences after in situ gas 
adsorption SAXS measurement, as confirmed by adsorption data and SAXS data 
in the vacuum. 

Structural analysis. For the structural analysis of IRMOF-74s at different gas 
pressures, Le Bail refinements²⁸ were performed using the JANA program²⁹ over 
the full sampled angular range, on the basis of the space group R3 for IRMOF- 
74-IV, IRMOF-74-V and IRMOF-74-VII, and R3 for IRMOF-74-III and IRMOF- 
74-V-hex. The SAXS patterns of activated IRMOF-74 samples in the vacuum 
condition were refined first as a reference. The reflection peaks were modelled 
by a pseudo-Voigt peak-shape function modified for asymmetry, with six refinable 
coefficients. The background was treated using a Legendre polynomial with 
six refinable parameters. Because the q range for the SAXS instrument could 
cover only hk0 reflections owing to the small unit-cell parameter c, only unit-cell 
parameter a was refined. The standard deviations of all data were derived from 
comparison of observed points in SAXS profiles with corresponding ones cal- 
culated after Le Bail refinement. The atomic coordinates for all MOF samples 
were adopted from the framework structures derived from single-crystal 
X-ray-diffraction data⁸ (Supplementary Tables 1-4). Because the number of reflections 
is limited, framework atomic coordinates were fixed for all data with different 
gas pressures. The distribution of adsorbates was calculated by difference Fourier 
analysis between the observed intensity and calculated intensity after a careful 
check of phase relationships among different reflections, and visualized using 
the VESTA program³⁰. The calculated intensity was derived from the atomic 
coordinates obtained from single-crystal X-ray-diffraction analysis of MOF 
structures and the atomic coordinates were fixed for different gas pressures. The 
correct phase relationship of the crystal-structure factors between different 
reflections was verified by the fact that we could observe electrons at the open 
metal sites at the beginning of gas uptake. Electron-density-map data were illustrated 
using a √3 × √3 p6 cell, which is the hexagonal projected structure of R3 
and R3, in order to show clearly the electron distribution in the pores (Fig. 1b, 
red parallelogram). The level of electron density (e~ A~*) is represented in blue/ 
green/red colour code for all IRMOF-74 data. All electron-density-map data 
were presented with atomic coordinates of IRMOF-74 to clarify the relative posi- 
tion of adsorbates in the MOF. 
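The profile model named above (a pseudo-Voigt peak on a Legendre-polynomial background) can be sketched in a few lines. This is an illustration only, not the JANA implementation: it uses the symmetric pseudo-Voigt form without the asymmetry modification mentioned in the text, and all function names and parameter values are hypothetical.

```python
import math


def pseudo_voigt(x, center, fwhm, eta, area=1.0):
    """Symmetric pseudo-Voigt: eta * Lorentzian + (1 - eta) * Gaussian.

    Both components are area-normalized and share the same FWHM.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    gauss = math.exp(-0.5 * ((x - center) / sigma) ** 2) / (
        sigma * math.sqrt(2.0 * math.pi))
    half = fwhm / 2.0
    lorentz = (half / math.pi) / ((x - center) ** 2 + half ** 2)
    return area * (eta * lorentz + (1.0 - eta) * gauss)


def legendre_background(x, coeffs, x_min, x_max):
    """Background as sum_n coeffs[n] * P_n(t), with x mapped onto t in [-1, 1]."""
    t = 2.0 * (x - x_min) / (x_max - x_min) - 1.0
    p_prev, p_curr = 1.0, t  # P_0 and P_1
    total = coeffs[0]
    for n in range(1, len(coeffs)):
        total += coeffs[n] * p_curr
        # Bonnet recurrence: (n + 1) P_{n+1} = (2n + 1) t P_n - n P_{n-1}
        p_prev, p_curr = p_curr, ((2 * n + 1) * t * p_curr - n * p_prev) / (n + 1)
    return total


def model_intensity(q, peaks, background_coeffs, q_min=0.01, q_max=0.71):
    """Calculated profile: background plus a sum of pseudo-Voigt peaks."""
    signal = sum(pseudo_voigt(q, *peak) for peak in peaks)
    return signal + legendre_background(q, background_coeffs, q_min, q_max)


# Hypothetical single peak (center, fwhm, eta, area) on a six-term background.
example = model_intensity(0.20, [(0.20, 0.01, 0.5, 1.0)],
                          [5.0, -1.0, 0.2, 0.0, 0.0, 0.0])
print(example)
```

In a Le Bail refinement the peak intensities are extracted freely rather than calculated from a structural model, so a function of this kind would be fitted to each pattern with the peak positions tied to the unit-cell parameter a.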


28. Marra, G. L. et al. Cation location in dehydrated Na-Rb-Y zeolite: an XRD and IR 
study. J. Phys. Chem. B 101, 10653-10660 (1997). 

29. Petříček, V., Dušek, M. & Palatinus, L. Crystallographic computing system 
JANA2006: general features. Zeitschrift Kristal. Crystalline Mater. 229, 345-352 
(2006). 

30. Momma, K. & Izumi, K. VESTA 3 for three-dimensional visualization of 
crystal, volumetric and morphology data. J. Appl. Crystallogr. 44, 1272-1276 
(2011). 


[Extended Data Figure 1 image not reproduced. Marked points: aggregation point, organization point, homogenization point. Panel axes: intensity (broad peak); intensity (superlattice); unit-cell parameter a; full width at half maximum (FWHM) of intensity profile. Horizontal axis: pressure.] 

Extended Data Figure 1 | The five stages of gas adsorption in 
IRMOF-74s. The five different adsorption stages are indicated in red 
at the top of the figure and their boundaries demarcated throughout all 
panels by grey dashed lines. a, The measured Ar adsorption by 
IRMOF-74-V-hex is shown; it can be compared against relevant SAXS profile 
features of IRMOF-74, measured as a function of Ar pressure, that are 
shown in the other panels. b, The appearance and disappearance of 
the broad peak indicates the formation of extra adsorption domains 
over pores (aggregation, red) and the even distribution of adsorbates 
(homogenization, blue). c, Intensity of 14 superlattice reflection, 
appearing as stage 3 turns to stage 4 (organization, green) and 
disappearing at the end of stage 4 (homogenization, blue). d, Change 
in the unit-cell parameter a of IRMOF-74. e, Change in the 
line-profile width of IRMOF-74. 


LETTER 


OPEN 


doi:10.1038/nature15714 


Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum 


Robert VanBuren¹*, Doug Bryant¹*, Patrick P. Edger²,³*, Haibao Tang⁴,⁵*, Diane Burgess², Dinakar Challabathula⁶†, Kristi Spittle⁷, 
Richard Hall⁷, Jenny Gu⁷, Eric Lyons⁴, Michael Freeling², Dorothea Bartels⁶, Boudewijn Ten Hallers⁸, Alex Hastie⁸, 
Todd P. Michael⁹ & Todd C. Mockler¹ 


Plant genomes, and eukaryotic genomes in general, are typically 
repetitive, polyploid and heterozygous, which complicates genome 
assembly¹. The short read lengths of early Sanger and current 
next-generation sequencing platforms hinder assembly through 
complex repeat regions, and many draft and reference genomes 
are fragmented, lacking skewed GC and repetitive intergenic 
sequences, which are gaining importance due to projects like 
the Encyclopedia of DNA Elements (ENCODE)². Here we report 
the whole-genome sequencing and assembly of the desiccation- 
tolerant grass Oropetium thomaeum. Using only single-molecule 
real-time sequencing, which generates long (>16 kilobases) 
reads with random errors, we assembled 99% (244 megabases) 
of the Oropetium genome into 625 contigs with an N50 length of 
2.4 megabases. Oropetium is an example of a ‘near-complete’ draft 
genome which includes gapless coverage over gene space as well as 
intergenic sequences such as centromeres, telomeres, transposable 
elements and rRNA clusters that are typically unassembled in draft 
genomes. Oropetium has 28,466 protein-coding genes and 43% 
repeat sequences, yet with 30% more compact euchromatic regions 
it is the smallest known grass genome. The Oropetium genome 
demonstrates the utility of single-molecule real-time sequencing for 
assembling high-quality plant and other eukaryotic genomes, and 
serves as a valuable resource for the plant comparative genomics 
community. 

The genomes of Arabidopsis³, rice⁴, poplar, grape and Sorghum⁵ 
were first sequenced using high-quality and reiterative Sanger-based 
approaches producing a series of ‘gold standard’ reference genomes. 
The advent of next-generation sequencing (NGS) technologies reduced 
costs of sequencing substantially, which has enabled sequencing of over 
100 plant genomes¹. The quality of plant genome assemblies depends 
on genome size, ploidy, heterozygosity and sequence coverage, but most 
NGS-based genomes have on the order of tens of thousands of short 
contigs distributed in thousands of scaffolds. The short read lengths of 
NGS, inherent biases and non-random sequencing errors have resulted 
in highly fragmented draft genome assemblies that are not complete, 
which means they are missing biologically meaningful sequences 
including entire genes, regulatory regions, transposable elements, 
centromeres, telomeres and haplotype-specific structural variations. 
It is becoming clear from ENCODE projects that complete genomes 
are needed to better understand the importance of the non-coding 
regions of genomes². 

More than 40% of calories consumed by humans are derived from 
grasses, and the grass family (Poaceae) is arguably the most important 
plant family with regard to global food security⁶. The size and complexity 
of most grass genomes have challenged progress in gene discovery 
and comparative genomics, although draft genomes are now available 
for most agriculturally important grasses¹. The largest genome 
assemblies, such as maize (2,300 megabases (Mb))⁷, barley (5,100 Mb)⁸ 
and wheat (hexaploid, 17,000 Mb)⁹, are highly fragmented as a result 
of the inability of current sequencing technologies to span complex 
repeat regions. Near-finished reference genomes are available for rice⁴, 
Sorghum⁵ and Brachypodium¹⁰, but more high-quality grass genomes 
are needed for comparative genomics and gene discovery. Here we present 
the ‘near-complete’ draft genome of the grass Oropetium thomaeum, 
the first high-quality reference genome from the Chloridoideae subfamily. 
The draft genome is near complete because we were able to 
sequence through complex repeat regions that are unassembled in most 
draft genomes. Oropetium has the smallest known grass genome at 
245 Mb and is also a resurrection plant that can survive extreme 
water stress, such as the loss of >95% of cellular water (Fig. 1)¹¹. 
Single-molecule real-time (SMRT) sequencing (Pacific Biosciences) 
produces long and unbiased sequences, which enables assembly of 
complex repeat structures and GC- and AT-rich regions that are often 
unassembled or highly fragmented in NGS-based draft genomes. We 
generated ~72× sequencing coverage of the Oropetium genome using 
32 SMRT cells on the PacBio RS II platform (which is equivalent to <1 
week of sequencing time and <US$10,000 in reagents). The resulting 
sequence had a read N50 length of over 16 kilobases (kb), and there was 
10× coverage of reads over 20 kb in length (Extended Data Fig. 1a). The 
raw reads were error-corrected using the hierarchical genome assembly 
process (HGAP), and the longest reads (>16 kb) were assembled using 
the Celera assembler followed by two rounds of genome polishing using 
Quiver¹². The assembly contains 650 contigs spanning 99% (244 Mb) 
of the estimated 245 Mb genome size (Extended Data Fig. 1b) with a 
contig N50 length of 2.4 Mb (Extended Data Fig. 1c). The final assembly 
consists of 625 contigs after removal of the complete chloroplast 
genome, mitochondria-derived contigs and contaminants. The 35 
largest contigs span half the genome, and the largest 107 contigs contain 
90% of the sequence. The 135,324 base-pair (bp) chloroplast genome 
assembled into a single contig that includes both ~25-kb inverted 
repeat regions, which typically collapse into a single copy during 
assembly. The mitochondrial genome was assembled into 20 partially 
overlapping circular chromosomes, which are the product of 
intramolecular recombination events and collectively span 1,100 kb. 
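The contig N50 statistic quoted above is the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the assembly. The sketch below uses a toy list of contig lengths, not the Oropetium data, and the function name is illustrative.

```python
def n50(lengths):
    """Return (N50 length, number of contigs needed to reach half the total).

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2.0
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one contig")


# Toy assembly (lengths in arbitrary units); the total is 24, so half is 12.
toy_contigs = [2, 3, 8, 5, 4, 2]
print(n50(toy_contigs))
```

The second value returned corresponds to the statement that the 35 largest contigs span half the genome; the same loop with a 90% threshold would give the analogous count of 107 contigs.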
The Oropetium genome has high contiguity for an uncurated 
draft plant genome. The average contig N50 length for all published 
plant genomes is 50 kb compared to 2.4 Mb for Oropetium (Extended 
Data Fig. 1d, e). After manual curation and data augmentation, only 
the Arabidopsis (TAIR10)¹³, rice (V7) and Brachypodium (V 2.1)¹⁰ 
genomes have longer contig N50 lengths. The accuracy rate is very 


¹Donald Danforth Plant Science Center, St Louis, Missouri 63132, USA. ²Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California 94720, USA. ³Department 
of Horticulture, Michigan State University, East Lansing, Michigan 48823, USA. ⁴iPlant Collaborative, School of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA. ⁵Center for 
Genomics and Biotechnology, Haixia Institute of Science and Technology (HIST), Fujian Agriculture and Forestry University, Fuzhou 350002, China. ⁶IMBIO, University of Bonn, Kirschallee 1, 
D-53115 Bonn, Germany. ⁷Pacific Biosciences, Menlo Park, California 94025, USA. ⁸BioNano Genomics, San Diego, California 92121, USA. ⁹Ibis Biosciences, Carlsbad, California 92008, USA. 
†Present address: Department of Life Sciences, School of Basic and Applied Sciences, Central University of Tamil Nadu, Thiruvarur 610101, India. 


*These authors contributed equally to this work. 


Figure 1 | Desiccation tolerance in the resurrection grass Oropetium thomaeum. a, Well watered. b, Desiccated (relative water content <5%) after 
9 days of drought stress. c, Condition 24 h post-hydration (relative water content >70%). 


high at 99.99995%, which is similar to Sanger-based approaches and 
higher than most NGS-based assemblies (Extended Data Fig. 1h). 
We plotted repeat density and GC content along the length of the 
contigs to identify factors causing contig breaks (Extended Data 
Fig. 1f, g). There is no correlation between repeat density and GC 
content at contig break points. This suggests that contig break points 
occur at the start of repeats or that most assembly breaks are caused 
by other factors, such as within-genome heterozygosity or 
haplotype-specific structural variation. To test this, we also tried 
‘diploid-aware’ assemblers Falcon (https://github.com/PacificBiosciences/falcon) 
and MinHash Alignment Process (MHAP)¹⁴. These assemblies 
had similar metrics but were less contiguous overall (Extended 
Data Fig. 1i). 
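For comparison with other assemblies, a per-base accuracy such as the 99.99995% quoted above is often restated as a Phred-scaled quality value or as an expected number of errors. The conversion below is the standard Phred formula and is offered only as a worked illustration; the text itself does not report a quality value, and the function names are hypothetical.

```python
import math


def phred_quality(accuracy_percent):
    """Phred-scaled quality: QV = -10 * log10(per-base error rate)."""
    error_rate = 1.0 - accuracy_percent / 100.0
    return -10.0 * math.log10(error_rate)


def expected_errors(accuracy_percent, n_bases):
    """Expected number of erroneous bases in n_bases at this accuracy."""
    return (1.0 - accuracy_percent / 100.0) * n_bases


ACCURACY = 99.99995          # per cent, as stated in the text
ASSEMBLY_BASES = 244e6       # 244 Mb assembly, as stated in the text

print(f"QV ~ {phred_quality(ACCURACY):.1f}")
print(f"expected errors ~ {expected_errors(ACCURACY, ASSEMBLY_BASES):.0f}")
```

On these figures the stated accuracy corresponds to about one error per two million bases.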

The completeness of the Oropetium genome allowed us to accu- 
rately survey its highly repetitive features that are often unassembled in 
most plant genomes. The Oropetium assembly captures all 18 telomeric 
arrays (Extended Data Table 1) with repeat number ranging from 40 to 
900, suggesting that at least some are full length. Three of the nine cen- 
tromeric satellites are completely assembled into large inverted repeats 
spanning 400 kb with a base monomer length of 155 bp, and higher 
order structures of dimers (310 bp), trimers (465 bp) and tetramers 
(620 bp; Fig. 2, Extended Data Fig. 2 and Supplementary Table 1). The 
remaining 40 centromeric sequences are incomplete centromere repeat 
fragments broken during assembly or solo repeats not associated with 
a larger centromere satellite. Nucleolus organizer regions contain tan- 
dem arrays of the 18S, 5.8S and 25S ribosomal RNA (rRNA) genes and 
typically span several megabase pairs with hundreds of nearly identi- 
cal 10-kb arrays. Twenty-two full-length rRNA tandem arrays in six 
contigs are found in the Oropetium assembly (Extended Data Table 2). 
The largest tandem array contains five identical and one partial 9-kb 
repeats collectively spanning 51 kb; this is approaching the theoretical 
limit given the read-length distributions of our data. The remaining 
rRNA tandem repeats probably collapsed during read correction or 
genome assembly given their high sequence conservation. 

Most repeats are incomplete, unassembled or highly collapsed in 
Illumina/454 NGS-based genomes, which has led to an underestima- 
tion and misclassification of repeat content in most plant genomes. 
Repetitive elements account for a surprisingly high proportion of the 
Oropetium genome (43%) compared to 21% in Brachypodium¹⁰, 35% 
in rice⁴, 54% in Sorghum⁵ and over 90% in wheat⁹ (Extended Data 
Table 3). Similar to these other genomes, the long terminal repeat (LTR) 
retrotransposons are the most abundant class and account for 35.6% of 
the Oropetium genome. We identified 3,247 intact LTRs in 358 families, 
which is similar to rice (3,663) and Brachypodium (2,162), but far less 
than Sorghum (17,022)¹⁵. Only ~2% of the repeats are unclassified, 
which reflects the completeness of individual repeat elements due to 
the long reads. 

Genome size in the grasses varies by several orders of magnitude as a 
consequence of polyploidy and genome bloating due to repetitive DNA 
accumulation¹⁶. Oropetium has the smallest known genome among the 
grasses¹⁷ at 90%, 60%, 50%, 30% and 10% the size of Brachypodium¹⁰, 
rice⁴, Setaria¹⁸, Sorghum⁵ and maize⁷, respectively. We found that 
Oropetium has a solo:intact LTR ratio >1, which is similar to small 
grass genomes like rice and Brachypodium, where proliferating LTRs are 
removed by illegitimate recombination, whereas large grass genomes 
like Sorghum and maize have solo:intact LTR ratios <1 (ref. 15). Despite 
its compact size, the Oropetium genome has a typical number of pre- 
dicted protein coding genes at 28,446. A pan-cereal whole-genome 
duplication (WGD) event, called rho, occurred before the diversi- 
fication of grasses>!°. There appear to have been no further WGDs 
in the selected grass genomes, including Oropetium, since the shared 
rho event*”. 
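The relative genome sizes quoted above can be checked against the absolute sizes given elsewhere in the text. The sketch below uses only the sizes that appear in this article (Oropetium 245 Mb, Brachypodium 272 Mb, Sorghum 730 Mb, maize 2,300 Mb); rice and Setaria are omitted because their sizes are not stated here. The names are illustrative.

```python
OROPETIUM_MB = 245

# Genome sizes in megabases, taken from values quoted in the text.
GENOME_SIZES_MB = {
    "Brachypodium": 272,
    "Sorghum": 730,
    "maize": 2300,
}


def percent_of(reference_mb, query_mb=OROPETIUM_MB):
    """Size of the query genome as a percentage of the reference genome."""
    return 100.0 * query_mb / reference_mb


for species, size_mb in GENOME_SIZES_MB.items():
    print(f"Oropetium is {percent_of(size_mb):.0f}% the size of {species}")
```

The computed values agree with the quoted percentages to within the rounding to the nearest 10% used in the text.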

Genome alignments between Oropetium and selected grass genomes 
are mostly one-to-one after exclusion of the alignments derived from 
the shared genome duplication events (Extended Data Fig. 3a-e). 
Overall, 75% of the Oropetium genome, or 89% of its gene space, is 
contained in conserved syntenic blocks when compared to other 
grasses. Genomic colinearity across grass genomes is extensive, with a 
high density of orthologous genes spanning much of the euchromatin 
(Fig. 3). Insertions of retrotransposons and non-collinear genes that 
originated elsewhere in the genome contribute greatly to the differences 
in the intergenic sequences in grasses²⁰. 

The relative sizes of syntenic blocks in the grass genomes track 
closely with the overall genome size difference (Extended Data Fig. 3f). 


[Figure 2 image not reproduced. a, Contig 24 (scale 0-2 Mb); b, Contig 1 (scale 0-8 Mb). Tracks: CDS, DNA-TE, LTR and CenOt.] 


Figure 2 | SMRT sequencing enables contiguous sequencing over 
complex regions. The distributions of centromere-specific satellite DNA 
(CenOt), long terminal repeat retrotransposons (LTRs), DNA transposable 
elements (DNA-TE) and coding DNA sequences (CDS) are plotted. 
a, The gap-free assembly of a full-length centromeric array and the 
flanking highly repetitive pericentromeric region. b, The largest contig 
(7.8 Mb), which has a more typical distribution of elements. 


[Figure 3 image not reproduced. Legend: Gene (+), Gene (-), Repeat. Clade labels: Poaceae, PACMAD, BOP. Region labels: Oropetium scaffold 3, 2.7-3.0 Mb; Setaria chr 3.] 


Figure 3 | Compact genome structure of Oropetium. Oropetium, part 
of the PACMAD clade, provides the first high-quality reference genome 
from the Chloridoideae subfamily, a large and diverse group of ~1,600 
species that contains the orphan crops tef (Eragrostis tef) and finger millet 
(Eleusine coracana). Typical micro-colinearity patterns among genomic 


In contrast, the genomic span of coding sequences is similar across 
genes that are retained in orthologous locations, although coding fea- 
tures are slightly smaller in Oropetium (Extended Data Table 4). The 
relatively constant sizes of coding sequences among grass genomes 
confirm that genome size differences are indeed due to variations in 
the intergenic contents. It was thought that plants have a ‘one-way 
ticket to genome obesity’ due to the retention of proliferating transposable 
elements²¹. However, analysis of the carnivorous plants Utricularia 
gibba (bladderwort, 82 Mb)²² and Genlisea aurea (corkscrew, 63.6 Mb)²³ 
provided evidence that almost all intergenic space can be purged. Small 
genomes also arise from a reduction in gene number as seen in the 
aquatic monocotyledon Spirodela polyrhiza, which has the fewest pre- 
dicted protein coding genes at 19,623 (ref. 24). Oropetium seems to have 
reduced both its intergenic and intragenic sequence. 

As the intergenic sequence in Oropetium is specifically reduced com- 
pared with other grasses (Extended Data Fig. 3f), we determined which 
sequence accounted for its smaller genome size by comparing highly 
syntenic regions of the larger 730 Mb Sorghum genome. To identify 
highly orthologous regions we looked for Sorghum genes (promoter, 
5′ UTR, exons, introns and 3′ UTR) with an increased number of conserved 
noncoding sequences²⁵. We then analysed the top 48 Sorghum 
genes against their orthologous sequences in Oropetium and found 
that they were 38% (±0.27, 1 s.d.) larger in Sorghum (Extended Data 
Fig. 4a). The primary driver of gene-space expansion was highly unique 
~1-kb intragenic sequences evenly spaced within the Sorghum genes. 
One explanation is that these evenly spaced highly unique sequences 
are degenerate remnants of transposons that have been partly purged 
from the Sorghum genome. Oropetium has a >1 solo:intact LTR ratio, 
consistent with active purging of transposons and complete loss of 
these regions. These results lend support to an emerging theory about 
the C-value paradox called the Genome Balance Hypothesis²⁶, which 
suggests that selection on gene networks and pericentromeric growth 
(centromere movement) is balanced by transposon proliferation and 
retention. Therefore, these evenly spaced highly unique sequences 
balance the 6:1 expansion of pericentromeric sequence in Sorghum as 
compared to Oropetium (Extended Data Fig. 4b). 

Desiccation tolerance was a key adaptation that permitted the 
most recent common ancestor of terrestrial plants to survive on land. 
Desiccation tolerance is widespread in bryophytes and lichens but rare in 
flowering plants, although similar mechanisms have evolved in vascular 
plants for seed and pollen desiccation. Desiccation tolerance to survive 
prolonged drought evolved independently in diverse monocotyledon 
and eudicotyledon lineages, and is found in at least 300 species. Gene 
duplications have provided the raw material for evolutionary innova- 
tion across plants. Tandem duplicated genes are often involved in stress 
responses and are probably important for adaptive evolution in dynam- 
ically changing environments. Oropetium has 6,668 tandem duplicated 
genes in 2,326 clusters, which is a slightly higher number than in other 


[Figure 3 region labels, continued (image not reproduced): 20.7-21.4 Mb; Sorghum chr 9, 49.4-48.5 Mb; Oryza chr 5, 20.1-19.6 Mb; Brachypodium chr 2, 23.4-24.1 Mb.] 


regions from Oropetium, Setaria, Sorghum, Oryza and Brachypodium 
are shown. Rectangles show predicted gene models, and colours 
indicate relative orientations. Matching gene pairs are displayed as grey 
connections. chr, chromosome. 


grasses, but a similar proportion (24% of genes). Tandem duplicated 
genes are enriched for gene ontology terms involved in response to abi- 
otic stresses, gene regulation and cellular metabolism (Supplementary 
Table 2). In addition, Oropetium has 4,209 homeologous gene pairs 
retained from the rho WGD event, which are enriched for gene ontology 
terms related to gene regulation and stress responses such as transcrip- 
tion factor activity, nitrogen metabolism, response to abiotic stimulus, to 
salt stress and to oxygen-containing compounds (Supplementary Tables 
3 and 4). Understanding the genomic mechanisms of extreme desicca- 
tion tolerance in resurrection plants such as Oropetium may provide 
targets for engineering drought and stress tolerance in crop plants. 

Pacific Biosciences (PacBio) SMRT sequencing has been used 
to close gaps in the human genome (ref. 27), assemble complete bacterial
genomes (ref. 12) and identify novel gene isoforms (ref. 28). Here we present a several
hundred megabase plant genome, sequenced and assembled entirely by 
SMRT sequencing. The long SMRT reads produced a near-complete 
draft genome that captured three of nine complete centromeres, all 
of the telomeres and biologically relevant features of the Oropetium 
genome. The total time from extracted DNA to a complete assembly 
was less than one month, and costs for PacBio were comparable to 
an Illumina-based genome assembly. Our study demonstrates that 
SMRT sequencing enables a new level of genome assembly required 
for full ENCODE-type analysis of intergenic sequence, which is not 
currently possible with other NGS-based methods. The compactness 
of the Oropetium genome results from purging of both inter- and intra- 
genic sequences, probably through small deletions during illegitimate 
recombination, as has been shown in other grasses. One hypothesis is 
that genome size is a function of cell size (ref. 29), and consistent with this, all
small plant genomes sequenced to date, including Arabidopsis (125 Mb),
Brachypodium (272 Mb), Selaginella (100 Mb), Spirodela (158 Mb) and
Utricularia (82 Mb) are plants of very small stature (Fig. 1). However, 
we provide evidence for the Genome Balance Hypothesis, which sug- 
gests that there is selective pressure on Oropetium to purge proliferat- 
ing transposons in order to maintain expression balance of networked 
genes and spacing in centromeres. The complete assembly of complex 
and highly similar repeat sequences demonstrated here suggests that 
SMRT sequencing can be used to assemble large and polyploid plant 
and other eukaryotic genomes, assuming ample sequence coverage and 
computational resources. SMRT-sequencing-based assemblies provide 
an opportunity to determine how these regions play a role in genome 
architecture and dynamics. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 28 April; accepted 10 September 2015. 
Published online 11 November 2015; corrected online 25 November 2015 
(see full-text HTML version for details). 




1. Michael, T. P. & VanBuren, R. Progress, challenges and the future of crop
genomes. Curr. Opin. Plant Biol. 24, 71-81 (2015).

2. Kellis, M. et al. Defining functional DNA elements in the human genome.
Proc. Natl Acad. Sci. USA 111, 6131-6138 (2014).

3. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the
flowering plant Arabidopsis thaliana. Nature 408, 796-815 (2000).

4. International Rice Genome Sequencing Project. The map-based sequence of
the rice genome. Nature 436, 793-800 (2005).

5. Paterson, A. H. et al. The Sorghum bicolor genome and the diversification of
grasses. Nature 457, 551-556 (2009).

6. Elert, E. Rice by the numbers: A good grain. Nature 514, S50-S51 (2014).

7. Schnable, P. S. et al. The B73 maize genome: complexity, diversity, and
dynamics. Science 326, 1112-1115 (2009).

8. International Barley Genome Sequencing Consortium. A physical, genetic and
functional sequence assembly of the barley genome. Nature 491, 711-716
(2012).

9. International Wheat Genome Sequencing Consortium (IWGSC). A
chromosome-based draft sequence of the hexaploid bread wheat (Triticum
aestivum) genome. Science 345, 1251788 (2014).

10. The International Brachypodium Initiative. Genome sequencing and analysis of
the model grass Brachypodium distachyon. Nature 463, 763-768 (2010).

11. Bartels, D. & Mattar, M. Oropetium thomaeum: A resurrection grass with a
diploid genome. Maydica 47, 185-192 (2002).

12. Chin, C.-S. et al. Nonhybrid, finished microbial genome assemblies from
long-read SMRT sequencing data. Nature Methods 10, 563-569 (2013).

13. Lamesch, P. et al. The Arabidopsis Information Resource (TAIR): improved gene
annotation and new tools. Nucleic Acids Res. 40, D1202-D1210 (2012).

14. Berlin, K. et al. Assembling large genomes with single-molecule sequencing
and locality-sensitive hashing. Nature Biotechnol. 33, 623-630 (2015).

15. El Baidouri, M. & Panaud, O. Comparative genomic paleontology across plant
kingdom reveals the dynamics of TE-driven genome evolution. Genome Biol.
Evol. 5, 954-965 (2013).

16. Michael, T. P. Plant genome size variation: bloating and purging DNA. Brief.
Funct. Genomic. 13, 308-317 (2014).

17. Jones, N. & Pašakinskienė, I. Genome conflict in the gramineae. New Phytol.
165, 391-410 (2005).

18. Bennetzen, J. L. et al. Reference genome sequence of the model plant Setaria.
Nature Biotechnol. 30, 555-561 (2012).

19. Tang, H., Bowers, J. E., Wang, X. & Paterson, A. H. Angiosperm genome
comparisons reveal early polyploidy in the monocot lineage. Proc. Natl Acad.
Sci. USA 107, 472-477 (2010).

20. Wicker, T., Buchmann, J. P. & Keller, B. Patching gaps in plant genomes results
in gene movement and erosion of colinearity. Genome Res. 20, 1229-1237
(2010).

21. Bennetzen, J. L. & Kellogg, E. A. Do plants have a one-way ticket to genomic
obesity? Plant Cell 9, 1509 (1997).

22. Ibarra-Laclette, E. et al. Architecture and evolution of a minute plant genome.
Nature 498, 94-98 (2013).

23. Leushkin, E. V. et al. The miniature genome of a carnivorous plant Genlisea
aurea contains a low number of genes and short non-coding sequences. BMC
Genomics 14, 476 (2013).

24. Wang, W. et al. The Spirodela polyrhiza genome reveals insights into its
neotenous reduction fast growth and aquatic lifestyle. Nature Commun. 5,
3311 (2014).




25. Lyons, E. & Freeling, M. How to usefully compare homologous plant 
genes and chromosomes as DNA sequences. Plant J. 53, 661-673 
(2008). 

26. Freeling, M., Xu, J., Woodhouse, M. & Lisch, D. A solution to the C-value paradox 
and the function of junk DNA: the Genome Balance Hypothesis. Mol. Plant 8, 
899-910 (2015). 

27. Chaisson, M. J. P. et al. Resolving the complexity of the human genome using 
single-molecule sequencing. Nature 517, 608-611 (2015). 

28. Au, K. F. et al. Characterization of the human ESC transcriptome by hybrid 
sequencing. Proc. Natl Acad. Sci. USA 110, E4821-E4830 (2013). 

29. Beaulieu, J. M., Leitch, I. J., Patel, S., Pendharkar, A. & Knight, C. A. Genome size
is a strong predictor of cell size and stomatal density in angiosperms. New 
Phytol. 179, 975-986 (2008). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work is supported in part by funding from the National
Science Foundation (DBI-1401572 to R.V.; DBI-120793 to P.P.E.), USDA NIFA
(CO471A-B to M.F.), the Department of Energy (DE-SC0012639 to T.C.M. and
T.P.M.; DE-SC-0008769 to T.C.M.), the Donald Danforth Plant Science Center
(to T.C.M.) and the Enterprise Rent-A-Car Institute for Renewable Fuels (to T.C.M.).
Sequencing was provided by Pacific Biosciences under the ‘Most Interesting 
Genome in the World’ 2014 SMRT grant program. 


Author Contributions R.V., D.Br., T.P.M. and T.C.M. designed and conceived 
research; D.Ba. and D.C. identified biological material, performed desiccation 
experiments and extracted DNA and RNA; R.V. prepared DNA for PacBio 
sequencing; T.P.M., R.V. and T.C.M. performed Illumina sequencing; K.S.,
R.H. and J.G. performed PacBio sequencing and assembly; B.T.H. and A.H.
conducted the BioNano analysis; D.Br., R.V., T.P.M. and T.C.M. annotated genome
features; E.L., M.F., D.Bu., R.V., D.Br., H.T., T.P.M., T.C.M. and P.P.E. analysed data;
R.V., T.P.M. and T.C.M. wrote the paper. All authors read and approved the final 
manuscript. 


Author Information The genome assembly and annotation have been 
deposited in CoGe under the accession code 25799
(https://genomevolution.org/CoGe/GenomeInfo.pl?gid=25799), in the NCBI BioProject under
PRJNA286116, and in GenBank under accession number LFJQ00000000.
Raw PacBio and Illumina reads are available at the Short Read Archive at 
NCBI under the aforementioned NCBI BioProject. Genome assembly and 
annotation are also available at http://www.oropetium.org/. Reprints and 
permissions information is available at www.nature.com/reprints. The 
authors declare no competing financial interests. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to T.P.M. (toddpmichael@gmail.com) or T.C.M. 
(tmockler@danforthcenter.org). 


This work is licensed under a Creative Commons Attribution-
NonCommercial-ShareAlike 3.0 Unported licence. The images or
other third party material in this article are included in the article’s Creative 
Commons license, unless indicated otherwise in the credit line; if the material 
is not included under the Creative Commons license, users will need to obtain 
permission from the license holder to reproduce the material. To view a copy 
of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ 








METHODS 


No statistical methods were used to predetermine sample size. 

Plant material. Oropetium thomaeum is a compact resurrection plant that has 
the smallest known genome among the grasses, at 245 Mb and 9 chromosomes 
(2n = 2x = 18; 1C = 0.25 pg) (ref. 11). We estimated the genome size to be 250 Mb by
flow cytometry and 245 Mb by k-mer analysis (Extended Data Fig. 1b). Oropetium
thomaeum plants were originally collected in Jodhpur, Rajasthan, India and
propagated as previously described (ref. 11). Oropetium is a member of the Chloridoideae
subfamily, a large and diverse group of roughly 1,600 species that contains the 
orphan crops tef (Eragrostis tef) and finger millet (Eleusine coracana) as well as 
some turf grasses (such as Bermuda grass, Cynodon dactylon and Zoysia japonica). 
SMRT PacBio sequencing. Fifty micrograms of high-molecular-weight Oropetium 
gDNA was extracted using a modified nuclei preparation method (ref. 30) followed by
an additional high-salt phenol-chloroform purification to minimize contamina- 
tion. A 20-kb insert SMRTbell library was generated using a 15 kb lower-end size 
selection protocol on the BluePippin (Sage Science). Initial titration runs were 
performed to optimize loading on the SMRT Cell for maximum performance. The 
Oropetium genome was sequenced using 32 SMRT Cells with 4-h collections and 
P6-C4 chemistry on the PacBio RS II platform (Pacific Biosciences). 
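
As a rough check on sequencing depth, the raw yield reported in Extended Data Fig. 1c can be divided by the estimated genome size. The Python sketch below only illustrates this arithmetic; the function name `fold_coverage` is hypothetical and is not part of any PacBio software.

```python
def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Return the average sequencing depth (x-fold): total bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Values taken from the text and Extended Data Fig. 1c.
RAW_BASES = 18_022_966_707   # total sequenced bases from the 32 SMRT Cells
GENOME_SIZE = 245_000_000    # k-mer-based genome size estimate (245 Mb)

coverage = fold_coverage(RAW_BASES, GENOME_SIZE)
print(f"Approximate raw coverage: {coverage:.1f}x")
```

Under these inputs the raw data correspond to roughly 74-fold coverage of the genome before any filtering or error correction.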

HGAP genome assembly. The Oropetium genome was assembled using the 
RS_HGAP_Assembly.3 protocol for assembly and Quiver for genome polishing
in SMRT Analysis v2.3.0 (ref. 12). This consisted of a three-step process involving
(1) generation of preassembled reads with improved consensus accuracy; 
(2) assembly of the genome through overlap consensus accuracy using Celera; and 
(3) one round of genome polishing with Quiver. For HGAP, the following param- 
eters were used: PreAssembler Filter v1 (minimum sub-read length = 3,000 bp, 
minimum polymerase read quality = 0.80, minimum polymerase read 
length = 3,000 bp); PreAssembler v2 (minimum seed length = 16,000 bp, number 
of seed read chunks = 6, alignment candidates per chunk = 10, total alignment 
candidates = 24, min coverage for correction = 6); AssembleUnitig v1 (target 
genome coverage = 30, overlap error rate = 0.06, minimum overlap = 40 bp and 
overlap k-mer = 14); and BLASR v1 mapping of reads for genome polishing with 
Quiver (max divergence percentage = 30, minimum anchor size = 12). A second 
round of genome polishing was performed using Quiver (SMRT Analysis v2.3.0) to 
further improve the site-specific consensus accuracy of the assembly. The following 
Quiver parameters were used for genome polishing: filtering (minimum sub-read 
length = 3,000 bp, minimum polymerase read quality = 0.80, minimum polymer- 
ase read length = 3,000 bp); mapping (maximum divergence percentage = 30, 
minimum anchor size = 12). Default parameters were otherwise employed for both 
HGAP assembly and Quiver protocols. 
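
The read-filtering and seed-selection thresholds listed above can be read as simple predicates on each read. The following Python sketch is only an illustration of those thresholds and is not SMRT Analysis code; the `Read` record, the function names and the toy reads are hypothetical.

```python
from dataclasses import dataclass

# Thresholds quoted in the HGAP parameter list.
MIN_SUBREAD_LEN = 3_000      # minimum sub-read length (bp)
MIN_POLYMERASE_LEN = 3_000   # minimum polymerase read length (bp)
MIN_READ_QUALITY = 0.80      # minimum polymerase read quality
MIN_SEED_LEN = 16_000        # minimum seed read length (bp)


@dataclass
class Read:
    """Hypothetical minimal record for one read."""
    name: str
    subread_len: int
    polymerase_len: int
    quality: float


def passes_filter(read: Read) -> bool:
    """True if the read meets all three pre-assembly filter thresholds."""
    return (read.subread_len >= MIN_SUBREAD_LEN
            and read.polymerase_len >= MIN_POLYMERASE_LEN
            and read.quality >= MIN_READ_QUALITY)


def is_seed(read: Read) -> bool:
    """True if a filtered read is also long enough to act as a seed read."""
    return passes_filter(read) and read.subread_len >= MIN_SEED_LEN


# Toy reads (invented values, for illustration only).
reads = [
    Read("r1", 18_500, 21_000, 0.86),
    Read("r2", 4_200, 9_000, 0.83),
    Read("r3", 2_100, 2_500, 0.88),
    Read("r4", 17_000, 17_500, 0.75),
]
kept = [r.name for r in reads if passes_filter(r)]
seeds = [r.name for r in reads if is_seed(r)]
print(kept, seeds)
```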

Falcon and MHAP assemblies. We also tested other assemblers to compare the 
PacBio HGAP assembly results (Extended Data Fig. 1i). Raw PacBio reads were 
error-corrected and assembled using Falcon and MHAP under default parame- 
ters. The Falcon and MHAP assemblies have lower contiguity than the HGAP 
assembly and have fewer assembled centromere and telomere sequences with a 
lower average length. 
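
Contiguity is compared here through the N50 statistic. As a reminder of what that number means, a minimal Python sketch is given below, assuming the common definition (the contig length at which the cumulative sum of length-sorted contigs first reaches half of the assembly); the toy lengths are invented.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Toy example: total = 25, half = 12.5; 8 + 6 = 14 reaches it, so N50 = 6.
toy_contigs = [8, 6, 5, 3, 2, 1]
print(n50(toy_contigs))
```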

Construction of a genome map using the Irys system for contig anchoring 
and scaffolding. Genome mapping from BioNano Genomics (ref. 31) was used to
improve the assembly quality of the Oropetium genome with the eventual goal 
of producing a chromosome-scale assembly. High molecular weight genomic 
DNA was isolated from fresh Oropetium tissue using the following protocol 
outline. Three grams of leaves were collected from live Oropetium thomaeum 
plants and fixed with formaldehyde. After blending with a tissue homogenizer 
in isolation buffer, a filtration step and Triton-X washing treatment were per- 
formed. The nuclei were purified on percoll cushions. The nuclei were washed 
extensively and embedded in low melting agarose at different dilutions. Finally, 
the DNA plugs were treated with a lysis buffer containing detergent, proteinase K
and β-mercaptoethanol (BME). In total, 53 Gb of data (>100 kb) were
collected representing ~200x genome coverage with a molecule N50 length of 
169 kb (Extended Data Fig. 5a). The size distribution was lower than expected 
and is probably a result of impurities during high-molecular-weight gDNA 
isolation that would cause shearing and inhibition of enzymes. Molecules were 
de novo assembled as previously described (ref. 32). Two genome maps were assembled
at different stringencies: map set 1 has 402 maps with an N50 length of 725 kb and
spans 216 Mb (Extended Data Fig. 5b); the second genome map has 214 maps and 
an N50 of 1.674 Mb. Combining the genome maps with the PacBio assembly to 
produce a hybrid scaffold was performed sequentially with the two genome maps. 
The scaffolding merged 90 contigs producing an assembly of 46 primary scaffolds 
covering 94% of the sequence assembly with an N50 of 7.8 Mb; in total there are 
535 scaffolds with an N50 of 7.1 Mb and total assembled size of 244 Mb. 

Variant calling using Illumina data. WGS Illumina sequences from Oropetium 
gDNA were used to assess the error rate of the PacBio assembly and residual 
within-genome heterozygosity (Supplementary Table 5). Raw Illumina HiSeq data 


from three different libraries of 570-bp insert, 1-kb insert and 3-kb insert sizes 
were trimmed for quality using Trimmomatic (v.0.32; ref. 33). Illumina sequence 
adaptors were removed, leading low quality (below quality 3) and N base pairs 
were trimmed, and reads were scanned using a 4-bp sliding window and trimmed 
when the average quality per base dropped below 30. Read pairs where both reads 
were ultimately of at least 36 bp in length following this quality control process 
were retained and used for subsequent analyses. 
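
The trimming rules described above (leading-base trimming, a 4-bp sliding window with a mean-quality threshold of 30, and a 36-bp minimum length) can be sketched in Python as follows. This is a simplified illustration of the logic only, not Trimmomatic's implementation; adaptor removal is omitted, and the function names and the toy read are hypothetical.

```python
MIN_LEADING_QUALITY = 3   # leading bases below this quality are removed
WINDOW = 4                # sliding-window width (bp)
MIN_WINDOW_MEAN = 30      # cut when the window's mean quality drops below this
MIN_LENGTH = 36           # both mates must be at least this long


def trim_read(seq, quals):
    """Return (seq, quals) after leading and sliding-window trimming."""
    # Drop leading bases that are N or fall below the leading-quality threshold.
    start = 0
    while start < len(seq) and (seq[start] == "N"
                                or quals[start] < MIN_LEADING_QUALITY):
        start += 1
    seq, quals = seq[start:], quals[start:]
    # Scan 5'->3' and cut at the first window whose mean quality is too low.
    end = len(seq)
    for i in range(0, len(seq) - WINDOW + 1):
        if sum(quals[i:i + WINDOW]) / WINDOW < MIN_WINDOW_MEAN:
            end = i
            break
    return seq[:end], quals[:end]


def keep_pair(read1, read2):
    """Keep a pair only if both trimmed mates reach the minimum length."""
    return len(read1) >= MIN_LENGTH and len(read2) >= MIN_LENGTH


# Toy read: two poor leading bases, 40 good bases, 4 poor trailing bases.
toy_seq = "NA" + "C" * 40 + "GGGG"
toy_quals = [2, 2] + [38] * 40 + [10] * 4
trimmed_seq, trimmed_quals = trim_read(toy_seq, toy_quals)
print(len(trimmed_seq))
```

In this sketch the cut falls at the start of the first failing window, so a couple of good bases just before the low-quality tail are also discarded.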

Quality-trimmed data were aligned to our assembly using BWA mem
(v.0.7.12-r1039; ref. 34). Duplicate alignments were marked using Picard tools v.1.104
MarkDuplicates (http://broadinstitute.github.io/picard/). Genome Analysis Toolkit
(v.3.3.0; ref. 35) IndelRealigner was used to perform local realignment around indels,
followed by application of GATK HaplotypeCaller to call variants. Identified single 
nucleotide polymorphisms were filtered by depth, strand bias, mapping quality and 
read position. Identified indels were filtered by depth, strand bias and read position. 

The native error rate of raw PacBio reads is in the range of 15-20%, raising 

the possibility that residual sequencing errors may be introduced into the final 
assembly of the Oropetium genome. Homozygous mismatches are classified as 
sequencing errors, and heterozygous mismatches indicate sites of heterozygosity. 
The accuracy rate is very high at 99.99995%, and a relatively high proportion of the 
errors (two-thirds) are small insertions or deletions (indels). The accuracy rate is 
similar to those obtained with WGS Sanger approaches (ref. 36) and is higher than those
reported for most NGS-based assemblies. The estimated residual within-genome 
heterozygosity for the Oropetium genome is very low at 0.087%, which probably 
contributed to the high contiguity of the assembly. This suggests that provided 
sufficient coverage, a PacBio SMRT-only approach can produce a high-quality 
complete plant genome. 
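
The heterozygosity estimate and the share of indels among the errors can be reproduced from the counts in Extended Data Fig. 1h. The Python sketch below assumes that the denominator is the total contig length from Extended Data Fig. 1c; that choice is an assumption made here for illustration and is not stated in the text.

```python
ASSEMBLY_BP = 243_174_629                # sum of contig lengths (Extended Data Fig. 1c)
HOM_SNPS, HOM_INDELS = 5_729, 10_687     # homozygous calls, classified as errors
HET_SNPS, HET_INDELS = 172_250, 40_249   # heterozygous calls


def percent(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


# Fraction of assembly errors that are small insertions or deletions.
indel_share = HOM_INDELS / (HOM_SNPS + HOM_INDELS)

# Heterozygous sites as a percentage of assembled bases (assumed denominator).
heterozygosity = percent(HET_SNPS + HET_INDELS, ASSEMBLY_BP)

print(f"indel share of errors: {indel_share:.2f}")
print(f"within-genome heterozygosity: {heterozygosity:.3f}%")
```

With these inputs, indels make up about two-thirds of the errors and the heterozygosity comes to about 0.087%, in line with the values quoted above.
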
Repeat annotation. To structurally annotate repeat sequences in the Oropetium 
genome, we began by discovering repetitive elements through application of the 
REPET v.2.2 packages TEdenovo and TEannot (ref. 37). The TEdenovo pipeline compares
the genome with itself to identify and classify repeated genomic elements.
All-by-all alignments were conducted with NCBI-BLAST+ using default
TEdenovo parameters. LTRharvest (ref. 38) was used for structural detection. During
clustering, Grouper, Recon and Piler steps were invoked both with and without
structural detection. Consensus building was performed using default parameters.
During the consensus feature-detection step, RepeatScout (ref. 39) was invoked, and
Pfam v26.0 HMM profiles (ref. 40) and Repbase (v18.08) nucleotide and amino acid
databanks were used.
Finally, consensus classification, filtering and clustering were performed using 
default parameters. 

Output from the TEdenovo pipeline was used as input to the TEannot pipeline. 
This pipeline mines the genome sequence using repeated sequences identified in the 
previous TEdenovo pipeline to produce classified non-redundant consensus repeat 
sequences along with short simple repeats, which are exported to GFF3 format. 
First, a set of perfectly matching sequences from the TEdenovo-output transposable 
elements (TE) library was selected by running a subset of the TEannot pipeline, pro- 
ducing a working reference TE library. This TE library was used in a full run of the 
TEannot pipeline. For alignment of the reference TE library, NCBI-BLAST+ was 
used, and blaster, repeat masker and censor steps were run both on the reference TE 
library and on randomized chunks. Filtering was applied using default parameters. 
Short simple repeats were identified using the crossmatch engine. Merging was 
performed using default parameters. For comparisons, Repbase (v18.08) nucleotide 
and amino acids databanks were used. Finally, filtering was applied using default 
parameters, and annotations were exported to GFF3 format. 

To classify identified repeats, non-redundant consensus repeat sequences 
as output by TEannot were annotated via PASTEClassifier v1.0
(https://urgi.versailles.inra.fr/Tools/PASTEClassifier/README). To classify these sequences,
Repbase (v18.08; ref. 41) nucleotide and amino acid sequences were used, as were
Pfam v26.0 (http://pfam.xfam.org/) HMM repeat profiles. Finally, identified 
LTRs were classified as Gypsy if homology or motif evidence existed for Gypsy 
and not for Copia, classified as Copia if the opposite were true, and otherwise 
classified as unknown. 
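
The Gypsy/Copia rule in the last sentence amounts to a small decision function. A minimal Python sketch is shown below; the function name and the boolean inputs are hypothetical stand-ins for whatever homology or motif evidence the classifier reports.

```python
def classify_ltr(gypsy_evidence: bool, copia_evidence: bool) -> str:
    """Assign an LTR superfamily from homology or motif evidence.

    Gypsy if there is evidence for Gypsy only, Copia if there is evidence
    for Copia only, and 'unknown' when evidence is absent or conflicting.
    """
    if gypsy_evidence and not copia_evidence:
        return "Gypsy"
    if copia_evidence and not gypsy_evidence:
        return "Copia"
    return "unknown"


for evidence in [(True, False), (False, True), (True, True), (False, False)]:
    print(evidence, classify_ltr(*evidence))
```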

Centromere and telomere identification. Centromeric repeats were identified 
using an approach outlined in ref. 42. Tandem Repeats Finder (TRF, version 4.07b; ref. 43)
was used to find tandem repeats using the parameters ‘1 1 2 80 5 200 2000 -d
-h’ in order to find high-order repeats. The resulting ‘.dat’ file was transformed
into a GFF3 file, which was used to identify telomeric and centromeric repeats.
To identify the centromeric repeats, the largest repeat arrays (period length ×
copy number) were identified and clustered. Clustered centromeric repeat regions
were transformed into FASTA files and aligned using ClustalX to identify array
sequence composition and orientation. The base centromere repeat was 155 bp, with
dimers (310 bp), trimers (465 bp) and tetramers (620 bp) also present (Extended Data Fig. 2
and Supplementary Table 1). The three largest centromeric arrays (contigs 003, 
028 and 064) were >400 kb and resolved into large inverted repeats, consistent 
with them being full length. The telomeric repeats were identified by searching 




the ends of contigs for short (~7 bp) high copy number repeats; 18 telomeric repeat 
sequences with the monomer AAACCCT were identified (Extended Data Table 1).
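
Two steps of this procedure lend themselves to a short illustration: ranking tandem-repeat arrays by period length × copy number, and counting copies of the telomere monomer near contig ends. The Python sketch below is a simplified stand-in, not the pipeline used here; the record type, function names, 200-bp end window and toy inputs are all hypothetical.

```python
from typing import NamedTuple


class TandemRepeat(NamedTuple):
    """Hypothetical record for one tandem-repeat array."""
    contig: str
    period: int      # repeat unit length (bp)
    copies: float    # copy number of the unit in the array


def array_size(rep: TandemRepeat) -> float:
    """Array size in bp: period length x copy number."""
    return rep.period * rep.copies


def largest_arrays(repeats, top=3):
    """Return the `top` arrays ranked by size, largest first."""
    return sorted(repeats, key=array_size, reverse=True)[:top]


TELOMERE = "AAACCCT"


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def telomere_copies(contig_seq, window=200):
    """Count monomer copies (either strand) within each contig end."""
    units = (TELOMERE, reverse_complement(TELOMERE))
    head, tail = contig_seq[:window], contig_seq[-window:]
    return (sum(head.count(u) for u in units),
            sum(tail.count(u) for u in units))


# Toy inputs (invented values, for illustration only).
toy_repeats = [
    TandemRepeat("ctgA", 155, 3000),
    TandemRepeat("ctgB", 7, 40),
    TandemRepeat("ctgC", 310, 1400),
    TandemRepeat("ctgD", 465, 10),
]
top_two = [r.contig for r in largest_arrays(toy_repeats, top=2)]
toy_contig = "AAACCCT" * 10 + "ACGTTGCA" * 50 + "AGGGTTT" * 5
end_counts = telomere_copies(toy_contig)
print(top_two, end_counts)
```
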
Transcriptome assembly. Total RNA was extracted from fresh, desiccated and 
24-h post rehydration Oropetium leaf tissues with 2 biological replicates collected 
for each tissue. RNA-seq libraries were prepared from the total RNA and bar-coded 
using TruSeq RNA Sample Prep Kits (Illumina) according to the manufacturer's 
protocol. Raw Illumina RNA-seq data from the six libraries were trimmed for qual- 
ity using Trimmomatic (v.0.32; ref. 33). Illumina sequence adaptors were removed, 
then leading low-quality (below quality 3) and N base pairs were trimmed and, 
finally, resulting trimmed reads were scanned using a 4-bp sliding window and cut 
when the average quality per base dropped below 30. Read pairs where both reads 
were ultimately of at least 36 base pairs in length following this quality control process
were retained and used for subsequent analyses. Trinity (v.r20140717; ref. 44) was
used to assemble quality filtered data. Assembled transcripts were aligned to our 
genome sequence using NCBI blastn v.2.2.30+ with an e-value cut-off of 1 x 10°. 
Successfully aligned transcripts were clustered at 90% identity using CD-HIT 
(v. 4.5.4), with representative sequences from each cluster retained and used to 
help parameterize gene calling. Eighty-seven per cent of the trimmed RNA-seq 
reads aligned to the Oropetium genome, suggesting that the genome is largely 
complete (Supplementary Table 5). Reads that failed to align may have been 
contaminants from other organisms. 

Gene annotation. Maker v2.31.8 (ref. 46; http://www.yandell-lab.org/software/maker.html)
was used to identify putative genes. Aligned and representative sequences
from our transcriptome assembly were input to Maker as expressed sequence tag
evidence. Rice and Brachypodium proteome sequences were clustered at 90% identity
using CD-HIT (v. 4.5.4; ref. 45), with representative sequences from each cluster
retained and input to Maker as multi-organismal protein homology evidence.
The Oropetium repeat database was input to Maker as a custom repeat library. 
SNAPhmm, Augustus, and GeneMarkHMM were invoked by Maker and were 
initially trained using rice and maize. Only genes for which the encoded protein 
was predicted to contain a complete open reading frame were retained. 

On the basis of the gene annotations provided by Maker, cufflinks (v2.2.1; ref. 47) was
used to identify predicted genes without empirical expression evidence. Quality- 
trimmed data from all six RNA-seq libraries were input simultaneously to cufflinks, 
with results used to identify genes with and without expression. 

Protein sequences from genes predicted by Maker were functionally annotated 
using NCBI blastp v.2.2.30+ versus the NCBI non-redundant refseq protein database
(http://www.ncbi.nlm.nih.gov/refseq/), versus the UniProt database (ref. 48), and
using InterProScan (v. 5.6-48.0; ref. 49).

Finally, Maker-predicted genes were pruned based on a Maker-defined annotation
edit distance (AED) score that measures distance between the predicted
gene and the evidence input to Maker, non-redundant (NR) annotation, Uniprot 
annotation, InterProScan annotation and expression level as output by cufflinks. 
Genes were removed that had no alignment evidence (AED = 1), no sequence 
match to either the NR or Uniprot databases, no InterProScan predicted domains 
and no expression evidence in our RNA-seq data. 
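
The pruning rule removes a gene only when every line of evidence is missing. A minimal Python sketch of that rule follows; the `GeneEvidence` record and its field names are hypothetical and do not correspond to Maker's output format.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Hypothetical summary of the evidence collected for one gene model."""
    gene_id: str
    aed: float                 # annotation edit distance (1 = no alignment evidence)
    has_nr_hit: bool           # match in the NR database
    has_uniprot_hit: bool      # match in the UniProt database
    has_interpro_domain: bool  # InterProScan-predicted domain
    is_expressed: bool         # expression evidence in the RNA-seq data


def should_remove(g: GeneEvidence) -> bool:
    """True only if the gene lacks all four kinds of supporting evidence."""
    return (g.aed >= 1.0
            and not g.has_nr_hit
            and not g.has_uniprot_hit
            and not g.has_interpro_domain
            and not g.is_expressed)


# Toy gene models (invented values, for illustration only).
genes = [
    GeneEvidence("g1", 0.12, True, True, True, True),
    GeneEvidence("g2", 1.0, False, False, False, False),
    GeneEvidence("g3", 1.0, False, False, True, False),
]
retained = [g.gene_id for g in genes if not should_remove(g)]
print(retained)
```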

Synteny and comparative genomics. Genome data sets from Setaria, Sorghum, 
rice and Brachypodium were downloaded from Phytozome (version 9.1) and 
subject to pairwise genome alignments against the Oropetium genome. For each 
pairwise alignment, the coding sequences of predicted gene models are compared 
to each other using adaptive seeds (ref. 50). Our synteny search pipeline defines syntenic
blocks by chaining the large-scale alignment tool (LAST) hits with a distance
cut-off of 20 genes, also requiring at least four gene pairs per syntenic block.
The syntenic blocks were further screened using QUOTA-ALIGN (ref. 51) to retain
one-to-one blocks and to exclude weak blocks derived from shared ancient duplications.
The resulting dot plots were visually inspected to confirm the structural similarity 
of the Oropetium genome in relation to other genomes (Extended Data Fig. 3a-e). 
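
The block definition above (hits chained within 20 genes, at least four gene pairs per block) can be illustrated with a simplified single-linkage grouping. The Python sketch below is not the pipeline used here: it assumes that anchors are (gene rank in genome A, gene rank in genome B) pairs on a single chromosome pair, and that two anchors are linked when they lie within the cut-off in both genomes.

```python
def syntenic_blocks(anchors, max_dist=20, min_pairs=4):
    """Group anchors (rank_a, rank_b) into blocks by single-linkage chaining."""
    parent = list(range(len(anchors)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair of anchors that is close enough in both genomes.
    for a in range(len(anchors)):
        for b in range(a + 1, len(anchors)):
            (i1, j1), (i2, j2) = anchors[a], anchors[b]
            if abs(i1 - i2) <= max_dist and abs(j1 - j2) <= max_dist:
                parent[find(a)] = find(b)

    # Collect anchors per group and keep groups with enough gene pairs.
    groups = {}
    for idx, anchor in enumerate(anchors):
        groups.setdefault(find(idx), []).append(anchor)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_pairs)


# Toy anchors (invented): one chain of four pairs, one of two, one singleton.
toy_anchors = [(1, 101), (3, 104), (10, 110), (25, 128),
               (200, 50), (205, 52), (400, 900)]
blocks = syntenic_blocks(toy_anchors)
print(blocks)
```

Only the four-pair chain survives in this toy example; the two-pair chain and the singleton fall below the minimum block size.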

Pairwise genomic alignments, described above, combined with OrthoMCL (ref. 52)
analyses filtered to one-to-one hits were used to identify orthologous gene clusters 
between Oropetium and Sorghum, rice, Vitis and Arabidopsis. The complete 
Oropetium—Arabidopsis orthologue list was then filtered to focus on genes with 
functional data in the STRING v9.1 global Arabidopsis protein interaction 
network (ref. 53). Gene expression patterns and duplicated genes (tandem and whole-
genome duplicates) were mapped onto this network using Cytoscape v3.1.1 (ref. 54)




to identify clusters of co-expressed and interacting duplicate genes, respectively 
(Extended Data Fig. 6). Various network statistics were calculated using 
NetworkAnalyzer (ref. 55), including average number of neighbours (that is, protein
interactions) and total number of isolated nodes (that is, without known interactors).
Constructing a gene interaction network. We constructed a gene interaction 
network for Oropetium on the basis of orthologous relationships with Arabidopsis 
genes with validated interactions and expression data yielding a network with 4,421 
nodes (gene products) with 36,918 edges (interactions). This network encompasses 
most metabolic pathways including photosynthesis, core anabolic and catabolic 
processes and stress response pathways (Extended Data Fig. 6). 
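
For orientation, the two network statistics mentioned above can be computed directly from node and edge lists. The Python sketch below assumes an undirected network without self-loops or duplicate edges, in which case the average number of neighbours is 2E/N; this is an illustrative calculation, not NetworkAnalyzer output, and the function names are hypothetical.

```python
def average_neighbours(n_nodes: int, n_edges: int) -> float:
    """Mean degree of an undirected graph without self-loops or multi-edges."""
    return 2 * n_edges / n_nodes


def isolated_nodes(nodes, edges):
    """Return nodes that take part in no edge, in sorted order."""
    connected = {n for edge in edges for n in edge}
    return sorted(set(nodes) - connected)


# Node and edge counts quoted in the text.
avg = average_neighbours(4_421, 36_918)
print(f"average neighbours per node: {avg:.1f}")

# Toy graph (invented) to show the isolated-node count.
toy_isolated = isolated_nodes(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
print(toy_isolated)
```

Under the stated assumption, the reported network has an average of about 16.7 neighbours per node.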


30. Zhang, H.-B., Zhao, X., Ding, X., Paterson, A. H. & Wing, R. A. Preparation of
megabase-size DNA from plant nuclei. Plant J. 7, 175-184 (1995).

31. Lam, E. T. et al. Genome mapping on nanochannel arrays for structural
variation analysis and sequence assembly. Nature Biotechnol. 30, 771-776
(2012).

32. Cao, H. et al. Rapid detection of structural variation in a human genome
using nanochannel-based genome mapping technology. GigaScience 3,
34 (2014).

33. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for
Illumina sequence data. Bioinformatics 30, 2114-2120 (2014).

34. Li, H. & Durbin, R. Fast and accurate short read alignment with
Burrows–Wheeler transform. Bioinformatics 25, 1754-1760 (2009).

35. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for
analyzing next-generation DNA sequencing data. Genome Res. 20, 1297-1303
(2010).

36. Ming, R. et al. The draft genome of the transgenic tropical fruit tree papaya
(Carica papaya Linnaeus). Nature 452, 991-996 (2008).

37. Flutre, T., Duprat, E., Feuillet, C. & Quesneville, H. Considering transposable
element diversification in de novo annotation approaches. PLoS ONE 6,
e16526 (2011).

38. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible
software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9,
18 (2008).

39. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families
in large genomes. Bioinformatics 21, i351-i358 (2005).

40. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42,
D222-D230 (2014).

41. Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements.
Cytogenet. Genome Res. 110, 462-467 (2005).

42. Melters, D. P. et al. Comparative analysis of tandem repeats from hundreds of
species reveals unique insights into centromere evolution. Genome Biol. 14,
R10 (2013).

43. Benson, G. Tandem repeats finder: a program to analyze DNA sequences.
Nucleic Acids Res. 27, 573 (1999).

44. Grabherr, M. G. et al. Full-length transcriptome assembly from
RNA-Seq data without a reference genome. Nature Biotechnol. 29, 644-652
(2011).

45. Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT Suite: a web server for
clustering and comparing biological sequences. Bioinformatics 26, 680-682
(2010).

46. Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for
emerging model organism genomes. Genome Res. 18, 188-196 (2008).

47. Trapnell, C. et al. Differential gene and transcript expression analysis of
RNA-seq experiments with TopHat and Cufflinks. Nature Protocols 7, 562-578
(2012).

48. Wu, C. H. et al. The Universal Protein Resource (UniProt): an expanding
universe of protein information. Nucleic Acids Res. 34, D187-D191 (2006).

49. Quevillon, E. et al. InterProScan: protein domains identifier. Nucleic Acids Res.
33, W116-W120 (2005).

50. Kiełbasa, S. M., Wan, R., Sato, K., Horton, P. & Frith, M. C. Adaptive seeds tame
genomic sequence comparison. Genome Res. 21, 487-493 (2011).

51. Tang, H. et al. Screening synteny blocks in pairwise genome comparisons
through integer programming. BMC Bioinformatics 12, 102 (2011).

52. Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups
for eukaryotic genomes. Genome Res. 13, 2178-2189 (2003).

53. Franceschini, A. et al. STRING v9.1: protein-protein interaction networks, with
increased coverage and integration. Nucleic Acids Res. 41, D808-D815
(2013).

54. Saito, R. et al. A travel guide to Cytoscape plugins. Nature Methods 9,
1069-1076 (2012).

55. Doncheva, N. T., Assenov, Y., Domingues, F. S. & Albrecht, M. Topological
analysis and interactive visualization of biological networks and protein
structures. Nature Protocols 7, 670-685 (2012).






[Extended Data Figure 1 panels; plot graphics not reproduced]

a, Histogram of subread lengths (x axis, subread length; y axes, number of
subreads and Mb > subread length). Mean read length = 12,872 bp; N50 = 16,485 bp.

b, K-mer distribution (x axis, coverage; y axis, frequency).

c, SMRT sequencing statistics.

Raw input:
  Mean subread length               12,872 bp
  N50 (subread length)              16,485 bp
  Total number of sequenced bases   18,022,966,707 bp
  Number of reads                   1,400,150

HGAP preassembly (BLASR):
  Seed length cutoff                16,000 bp
  Pre-assembled bases               (value illegible in source)
  Pre-assembled reads               464,567
  Pre-assembled N50                 18,572 bp

Output (Celera Assembler):
  Number of polished contigs        625
  Max contig length                 7,984,151 bp
  N50 contig length                 2,386,328 bp
  Sum of contig lengths             243,174,629 bp

d, Contig N50 (kb). e, Scaffold N50 (kb).

f, % repeats versus position along contig. g, GC content versus position along contig.

h, Errors in PacBio assembly:
  Number of homozygous SNPs                   5,729
  Number of homozygous InDels                 10,687
  Estimated accuracy                          99.9999%
Within-genome heterozygosity:
  Number of heterozygous SNPs                 172,250
  Number of heterozygous InDels               40,249
  Estimated within-genome heterozygosity      0.09%

i,
                      HGAP      Falcon    MHAP
  Assembly size       244 Mb    299 Mb    240 Mb
  Contig N50 length   2.4 Mb    1.9 Mb    1.1 Mb
  Number of contigs   625       612       941
  Largest contig      7.9 Mb    7.0 Mb    6.3 Mb

Extended Data Figure 1 | Summary of the Oropetium genome assembly
statistics. a, Histogram of length distribution of raw P6C4 chemistry
PacBio reads. The mean read length of the raw reads is 12,872 bp, and the
N50 is 16,485 bp. b, Genome size estimation using k-mer distribution.
K-mer distribution of unassembled Oropetium Illumina WGS reads.
K-mer frequency displays a unimodal curve indicating a low rate of
heterozygosity in the Oropetium genome. Frequency distribution suggests
a genome size of ~245 Mb, consistent with flow-cytometry-based
estimations. c, SMRT sequencing raw read, preassembly and assembly
statistics. d, e, The distribution of the contig N50 length (d) and scaffold
N50 length (e) of all published plant genomes is plotted. The average
contig N50 length for published plant genomes is ~50 kb compared to
2.4 Mb for Oropetium. f, g, Repeat density (as a function of percentage
repeats) (f) and GC content (g) are plotted at a scaled position along each
contig. Each contig was divided into 5,000 sliding windows with each
window representing 0.02% of the contig length and the averages of each
scaled sliding window are plotted. Repeat content and GC content do
not vary at the ends of contigs. h, Estimated accuracy of SMRT PacBio
assembly and within-genome heterozygosity. i, Comparison of HGAP,
Falcon and MHAP PacBio assemblers.
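
The N50 values quoted in this legend can be reproduced from any list of read or contig lengths. The sketch below is illustrative only: it assumes the common definition of N50 (the length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total). The function name and the toy lengths are assumptions and are not part of the published assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Toy contig lengths (hypothetical, not the Oropetium contigs).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Sorting once and accumulating keeps the calculation linear after the sort, which is sufficient for assemblies with hundreds of contigs.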




[Extended Data Figure 2 panels. a, centromere repeat forms: monomer 155 bp, dimer 310 bp, trimer 465 bp, tetramer 620 bp. b, x axis: match identity (%). e, genome browser tracks for contig028.]
Extended Data Figure 2 | PacBio sequencing and assembly completely
resolves the Oropetium centromeres. a, The Oropetium centromere
repeat base is 155 bp (red arrow), whereas they are also found in dimer
(310 bp, grey arrow), trimer (465 bp, black arrow) and tetramer (620 bp,
white arrow) form. b, As the copy number of a repeat increases, the match
identity between monomers in the repeat decreases. c, The inverted repeat
structure of the entire centromere on contig028 with a 60 kb spacer (blue
box); arrows are as in a. d, Consensus 155 bp centromere monomer.
e, Integrated genome browser view of centromere repeat, LTRs and
predicted genes on contig028.





Extended Data Figure 3 | Macrosynteny patterns and comparative
genomics between the grasses. a-e, Macrosynteny of Oropetium versus
Oropetium (a); Oropetium versus Brachypodium (b); Oropetium versus
rice (c); Oropetium versus Setaria (d); and Oropetium versus Sorghum (e).
f, Genome compaction in Oropetium compared to related grass genomes.
Syntenic block span is based on regions that show conserved synteny
across all five genomes. Syntenic gene and coding DNA sequences span is
based on 13,683 genes that are retained as genes in orthologous locations
across all five genomes. The ratio compared to Oropetium is given in
brackets.

f
                       Oropetium   Setaria          Sorghum          Rice             Brachypodium
Assembly size          243 Mb      406 Mb (1.7x)    739 Mb (3.0x)    374 Mb (1.5x)    272 Mb (1.1x)
Syntenic block span    182 Mb      353 Mb (1.9x)    588 Mb (3.4x)    295 Mb (1.6x)    251 Mb (1.4x)
Syntenic gene span     43.6 Mb     42.6 Mb (1.0x)   45.0 Mb (1.0x)   46.4 Mb (1.1x)   50.4 Mb (1.2x)
Syntenic CDS span      15.8 Mb     17.0 Mb (1.1x)   16.6 Mb (1.1x)   16.5 Mb (1.0x)   16.5 Mb (1.1x)
Syntenic intron span   24.8 Mb     21.4 Mb (0.9x)   25.0 Mb (1.0x)   23.3 Mb (0.9x)   30.1 Mb (1.2x)


[Extended Data Figure 4 panels. a, GEvo view: Oropetium thomaeum (Maker v2.2, hardmasked by RepeatMasker) gene Oropetium_20150105_16919 (chr: Oropetium_genomic_20141112_061 528855-547211, reverse complement) above its Sorghum orthologue. b, syntenic dot plot of Oropetium against Sorghum.]

Extended Data Figure 4 | Expansion of intragenic and pericentric
regions in Sorghum compared to Oropetium. a, A GEvo sequence
similarity graphic of an Oropetium gene (upper) and its orthologous
Sorghum gene (lower). Blast hits (high-scoring segment pairs) are denoted
by red rectangles, and syntenic hits are connected by a red line. The
green rectangles on the model line of Sorghum are conserved noncoding
sequences (CNS) computed between Sorghum and rice; the expanse of
CNS coverage defines ‘gene space’. Within the oval are three CNS that
may be spatially constrained. The expanded interspersed sequences are
annotated at the bottom in black. b, Pericentric region expansion in
Sorghum compared to Oropetium. A syntenic dot plot of the Sorghum
and Oropetium genomes is plotted. Oropetium contigs are ordered based
on synteny with Sorghum. Hits are coloured based on Ks divergence,
with purple blocks corresponding to 1:1 orthologous regions and other
colours corresponding to retained genes from the rho and sigma WGDs.
Pericentric regions in Sorghum have few syntenic matches to Oropetium,
suggesting that much of the expansion occurred in pericentric regions.




[Extended Data Figure 5 panel a: molecule size distribution; x axis, molecule length (kb).]

Extended Data Figure 5 | Assembly improvement using a BioNano-based
genome map from the Irys system. a, Distribution of molecule size
for raw single-molecule genome mapping data. Size of single molecules
in nanochannel arrays is plotted. b, Integration of the genome map with
the genome assembly. Overlap between the PacBio-based contigs and the
genome map. Each line shows a single PacBio contig in green; genome
maps are shown in light blue.




[Extended Data Figure 6 panels. b, frequency against number of shared neighbors. c, average neighborhood connectivity against number of neighbors.]


Extended Data Figure 6 | Network statistics for tandem duplicated genes. a, Tandem duplicated genes in the metabolic network are shown in pink. 
b, Distribution of shared neighbours. c, The average number of neighbours. 




Extended Data Table 1 | Telomere repeat (AAACCCT) locations and organization in the Oropetium genome 


Contig name             Contig size (bp)   Start of telomeric array   End of telomeric array   Size of telomeric array (bp)   Telomeric repeat sequence   Position of telomere on contig   Number of repeats
Oropetium_genomic_143   99,304             1                          6,446                    6,445                          AACCCTA                     start                            910.1
Oropetium_genomic_058   1,564,795          1,560,696                  1,564,795                4,099                          AGGGTTT                     end                              580.9
Oropetium_genomic_552   22,498             18,622                     22,498                   3,876                          GTTTAGG                     end                              562.9
Oropetium_genomic_043   1,920,679          1                          3,643                    3,642                          CCCTAAA                     start                            515.7
Oropetium_genomic_050   1,822,802          1                          3,223                    3,222                          CCTAAAC                     start                            453.3
Oropetium_genomic_125   248,855            1                          3,182                    3,181                          AAACCCT                     start                            452.4
Oropetium_genomic_027   2,706,558          2                          2,092                    2,090                          AAACCCT                     start                            301.9
Oropetium_genomic_169   56,172             54,243                     56,170                   1,927                          TTAGGGT                     end                              279.7
Oropetium_genomic_103   526,141            524,277                    526,139                  1,862                          GTTTAGG                     end                              265.9
Oropetium_genomic_124   262,476            260,617                    262,476                  1,859                          TTTAGGG                     end                              264
Oropetium_genomic_090   736,395            1                          1,601                    1,600                          CCTAAAC                     start                            [illegible]
Oropetium_genomic_010   4,141,579          4,140,107                  4,141,579                1,472                          TTAGGGT                     end                              208.7
Oropetium_genomic_076   1,024,162          1                          1,169                    1,168                          CCCTAAA                     start                            166.1
Oropetium_genomic_493   25,796             24,869                     25,795                   926                            GGGTTTA                     end                              129.9
Oropetium_genomic_136   153,270            152,446                    153,270                  824                            GTTTAGG                     end                              119.1
Oropetium_genomic_155   63,826             63,040                     63,826                   786                            TTTAGGG                     end                              110.4
Oropetium_genomic_019   3,122,409          1                          347                      346                            AAACCCT                     start                            48
Oropetium_genomic_149   80,145             1                          294                      293                            AAACCCT                     start                            40.4
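
The repeat units listed in this table differ only in phase and strand: each is a rotation of AAACCCT or of its reverse complement, AGGGTTT. A minimal sketch of that check is given below. The helper names are hypothetical, and the code is an illustration of the relationship between the listed units, not the annotation method used to build the table.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def rotations(seq):
    """Return the set of all cyclic rotations of seq."""
    return {seq[i:] + seq[:i] for i in range(len(seq))}


def is_telomere_unit(unit, canonical="AAACCCT"):
    """True if unit is a rotation of the canonical repeat or of its
    reverse complement."""
    candidates = rotations(canonical) | rotations(reverse_complement(canonical))
    return unit in candidates


# Distinct repeat units as they appear in the table above.
table_units = ["AACCCTA", "AGGGTTT", "GTTTAGG", "CCCTAAA", "CCTAAAC",
               "AAACCCT", "TTAGGGT", "TTTAGGG", "GGGTTTA"]
print(all(is_telomere_unit(unit) for unit in table_units))
```

In the table, arrays at contig starts are reported in the C-rich phase and arrays at contig ends in the G-rich phase, which is what one strand of a chromosome with a telomere at each end would show.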




Extended Data Table 2 | rRNA tandem array locations and organization in the Oropetium genome 


Contig name             Contig size (bp)   Start of array (bp)   End of array (bp)   Size of NOR tandem array (bp)   Position of NOR on contig   Number of tandem repeats
Oropetium_genomic_182   51,716             1                     51,716              51,716                          spans contig                5.7
Oropetium_genomic_265   38,885             1                     38,885              38,885                          spans contig                4.3
Oropetium_genomic_168   56,772             1                     56,772              56,772                          spans contig                6.3
Oropetium_genomic_192   48,530             1                     42,860              42,860                          start                       4.7
Oropetium_genomic_214   44,298             31,977                44,298              12,321                          start                       1.3
Oropetium_genomic_539   23,633             1                     20,975              20,975                          start                       2.3




Extended Data Table 3 | Repeat annotation of the Oropetium genome 


Repeat Class                      Number of Elements   Percent of Base Pairs Covered
Retrotransposon                   214,698              35.60%
Long terminal repeat (LTR)        107,010              25.50%
Gypsy (RLG)                       83,872               21.80%
Copia (RLC)                       18,223               36.90%
Penelope (RPX)                    1,548                0.15%
Unknown LTR (RLX)                 3,367                0.44%
LINE (RIL)                        17,399               1.90%
SINE (RSX)                        2,735                0.07%
DIRS (RYD)                        5,098                3.00%
Unknown retrotransposon (RXX)     82,456               7.50%
DNA transposon                    69,217               8.50%
Maverick (DMX)                    68                   0.01%
TIR (DTX)                         41,930               6.60%
Unknown DNA transposon (DXX)      27,219               1.90%
No category                       7,902                1.00%

Total                             291,817              43.80%




Extended Data Table 4 | Comparisons of repeats and coding features in the monocotyledons 


Genome features
Common name        Species name              Chr. #        Genome size (Mb)   Repeat (%)   # Genes
Greater duckweed   Spirodela polyrhiza       20            150                23           19,519
Oropetium          Oropetium thomaeum        9             250                43           28,446
brachy             Brachypodium distachyon   [illegible]   272                21           42,868
rice               Oryza sativa              12            403                35           66,338
setaria            Setaria italica           9             510                40           29,448
sorghum            Sorghum bicolor           10            818                62           40,599
corn               Zea mays                  10            2,300              85           63,540

Transcript statistics
Common name        Avg length   Median length   Avg num exons   Median num exons
Greater duckweed   4,718        3,015           5.22            3
Oropetium          2,729        1,928           4.55            3
brachy             3,819        3,128           5.38            4
rice               3,191        2,701           [illegible]     3
setaria            3,299        2,563           4.96            3
sorghum            2,745        2,189           4.74            3
corn               4,236        2,747           4.6             3

Exon statistics
Common name        Count     Avg length   Median length
Greater duckweed   101,867   222          129
Oropetium          129,421   210          126
brachy             154,738   254          137
rice               238,247   331          162
setaria            134,802   261          137
sorghum            160,151   252          140
corn               203,643   238          133

Intron statistics
Common name        Count     Avg length   Median length
Greater duckweed   82,368    757          202
Oropetium          100,975   446          168
brachy             120,380   402          142
rice               177,497   389          166
setaria            106,488   436          145
sorghum            122,497   326          133
corn               149,177   670          154




LETTER 


doi:10.1038/nature15763 


Sweet and bitter taste in the brain of awake behaving animals


Yueqing Peng¹,²,³, Sarah Gillis-Smith¹,²,³, Hao Jin¹,²,³, Dimitri Tränkner¹,²,⁴, Nicholas J. P. Ryba⁵ & Charles S. Zuker¹,²,³,⁴


Taste is responsible for evaluating the nutritious content of food, 
guiding essential appetitive behaviours, preventing the ingestion of 
toxic substances, and helping to ensure the maintenance of a healthy 
diet. Sweet and bitter are two of the most salient sensory percepts 
for humans and other animals; sweet taste allows the identification 
of energy-rich nutrients whereas bitter warns against the intake of 
potentially noxious chemicals’. In mammals, information from 
taste receptor cells in the tongue is transmitted through multiple 
neural stations to the primary gustatory cortex in the brain. Recent 
imaging studies have shown that sweet and bitter are represented 
in the primary gustatory cortex by neurons organized in a spatial 
map**, with each taste quality encoded by distinct cortical fields’. 
Here we demonstrate that by manipulating the brain fields 
representing sweet and bitter taste we directly control an animal’s 
internal representation, sensory perception, and behavioural 
actions. These results substantiate the segregation of taste qualities 
in the cortex, expose the innate nature of appetitive and aversive 
taste responses, and illustrate the ability of gustatory cortex to 
recapitulate complex behaviours in the absence of sensory input. 

In mice, sweet and bitter activate cortical fields in the insula (taste
cortex) that are separated topographically by approximately 2 mm
(ref. 4) (Fig. 1a and Extended Data Fig. 1). We hypothesized that if
these cortical fields represent sweet and bitter percepts, their direct
activation would evoke ‘bitter and sweet sensation’ even in the absence
of an actual bitter or sweet stimulus. To optogenetically control
activation of the gustatory cortex, we introduced channelrhodopsin⁵ (ChR2)
to the insula of wild-type mice by stereotaxic injection of
adeno-associated virus (AAV) targeted to either the bitter or the sweet
cortical field (see Fig. 1a, b, Extended Data Fig. 1, Supplementary Table 1
and Methods for details). Single unit recordings of the insular cortex
of transduced animals demonstrated that photostimulation evoked
reliable neuronal firing that was phase locked to light delivery (Fig. 1c
and Extended Data Fig. 1b).

We reasoned that optogenetic activation of the sweet cortical field
should trigger behavioural attraction, whereas stimulation of the bitter
field should cause strong behavioural avoidance. We used a
place-preference test⁶ where animals expressing ChR2 in the sweet cortex
were introduced to a two-chamber arena in which presence in one
of the two chambers was coupled to optogenetic stimulation, in the
absence of any reward or punishment; we then determined the animal’s
preference index as a measure of the time spent in the chamber
that was coupled with light stimulation. When the sweet cortical field
was stimulated, animals developed strong preference for the chamber
coupled to ChR2 stimulation (Fig. 1d and Extended Data Fig. 2). This
preference could be transferred to either side of the arena by switching
the chamber coupled to the laser stimulation of sweet cortex (Fig. 1d,
compare chamber 1 versus chamber 2). When the same sets of experiments
were performed in animals expressing ChR2 in the bitter cortical
field, mice now displayed a range of unconditioned aversive behaviours
(see next section), and after just a few sessions strongly avoided the
chamber linked to photostimulation (Fig. 1e). Mice injected with a
control AAV expressing enhanced green fluorescent protein (AAV-eGFP
construct) exhibited no significant place preference after laser
stimulation of either the sweet or bitter cortical fields (Extended Data
Fig. 2b). Together, these observations demonstrate that neurons in the
sweet and bitter cortical fields drive attractive and aversive responses,
respectively.
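
The preference index is described here only as a measure of the time spent in the light-coupled chamber; its exact formula is not given in this section. As a hedged illustration, the sketch below assumes a normalized difference between the times spent in the two chambers, which gives values between -1 and 1. The formula, the function name and the example times are all assumptions.

```python
def preference_index(time_stim_chamber, time_other_chamber):
    """Assumed preference index: (t_stim - t_other) / (t_stim + t_other).

    Positive values mean more time in the light-coupled chamber,
    negative values mean avoidance of it, and 0 means no preference.
    """
    total = time_stim_chamber + time_other_chamber
    if total <= 0:
        raise ValueError("total time must be positive")
    return (time_stim_chamber - time_other_chamber) / total


# Hypothetical 5 min (300 s) test: 210 s in the light-coupled chamber.
print(preference_index(210, 90))
```

Under this assumed definition, swapping which chamber is coupled to light flips the sign of the index for the same behaviour.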

Next, we examined if activation of the bitter and sweet cortical fields
evokes classical taste behaviours⁷. We hypothesized that optogenetic
activation of the bitter cortical field should trigger strong
light-dependent suppression of licking, while activation of the sweet
cortical field should trigger appetitive responses.

We used a behavioural test where motivated (thirsty) animals were
trained to lick water in response to a combination visual/tone cue in a
head-restrained set-up⁸ (see Methods). We then subjected the trained
animals expressing ChR2 in the bitter cortical field to testing sessions
consisting of a series of water-only trials, but in half of the trials the
bitter cortical field was stimulated upon contact of the tongue with
the water spout.

During the entire session we imaged (facial features), recorded, and
measured licking responses. Figure 2 demonstrates that when the bitter
cortical field was stimulated, there was a dramatic suppression of
licking behaviour (see also Supplementary Video 1), with the animal’s
response closely following the ChR2 activation of the bitter cortex.
Notably, after strong laser stimulation (10-20 mW), the animals
displayed prototypical taste rejection orofacial responses, sometimes
including gagging (gaping⁹), and attempts to clean and rid the mouth
of the non-existent bitter tastant (Supplementary Video 1; see legend
for details).

What about the sweet cortical field? A characteristic feature of
sweet taste is that non-thirsty animals remain robustly attracted to
sweet solutions, even though they exhibit limited interest for water¹⁰.
Therefore, we predicted that a mildly water-satiated animal expressing
ChR2 in the sweet cortical field would still show little attraction
for water in control trials (referred to as off-trials), but would exhibit
significantly enhanced licking during water trials coupled to laser
stimulation of the sweet cortical field (referred to as on-trials).
Importantly, the experiment was set up such that the laser shutter was
under contact-licking operation, so the animal had control of its own
stimulation during the on-trials, and therefore only persistent licking
(self-stimulation) would continue to activate the sweet cortex. Our
results demonstrate that animals aggressively self-stimulated during
on-trial sessions, with ChR2 activation of the sweet cortical field
radically increasing licking behaviour, even though the spout still
delivered only water, as in the off-trials (Fig. 2b, d).

Just as a lot of sugar can ‘mask’ a bitter tastant, we hypothesized that
strong activation of the cortical field representing sweet taste might be 
capable of overcoming the natural aversion to an orally applied bitter 


¹Howard Hughes Medical Institute, Columbia College of Physicians and Surgeons, Columbia University, New York, New York 10032, USA. ²Departments of Biochemistry and Molecular Biophysics,
Columbia College of Physicians and Surgeons, Columbia University, New York, New York 10032, USA. ³Department of Neuroscience, Columbia College of Physicians and Surgeons, Columbia
University, New York, New York 10032, USA. ⁴HHMI/Janelia Farm Research Campus, 19700 Helix Drive, Ashburn, Virginia 20147, USA. ⁵National Institute of Dental and Craniofacial Research,
National Institutes of Health, Bethesda, Maryland 20892, USA.


512 | NATURE | VOL 527 | 26 NOVEMBER 2015 




Figure 1 | Place preference by photostimulation of the sweet and bitter
cortical fields. a, Sample injection of reporters in stereotactic coordinates
defining the sweet and bitter cortical fields. Top: sweet cortex labelled with
AAV-GFP and bitter cortex with AAV-TdTomato; bottom: a horizontal
section. See Extended Data Fig. 1 for additional data. b, Coronal section
of a mouse brain (bregma −0.2) stained with TO-PRO-3 (blue). Shown is
a representative histological sample of the bitter cortical field expressing
ChR2 fused to yellow fluorescent protein (ChR2-YFP), illustrating the
location and trajectory (dotted lines) of the implanted guide cannula;
IC, insular cortex. c, In vivo recording of ChR2-expressing insular cortical
neurons in response to light stimulation (ten pulses, 10 Hz). The expanded
traces show responses to each light pulse (blue bars below the trace). d, Left:
representative tracking of a mouse during the 5 min preference test in a
two-chamber arena; chamber 1 was coupled to light stimulation of the sweet
cortical field during the training sessions. Shown are the fractions of time
spent in each chamber. Right: quantitation of preference index before (pre-)
and after (chamber 1) training with photostimulation of the sweet cortical
field (n = 13 animals; Mann-Whitney U-test, P < 0.003). Preference can be
readily reversed by light stimulation in the opposing side (chamber 2, n = 6;
P < 0.02). e, Representative mouse track and quantitation of preference
index in mice expressing ChR2 in the bitter cortical field; note significant
aversion to the chamber coupled to photostimulation (chamber 1, n = 15;
Mann-Whitney U-test, P < 0.005); this behavioural aversion can be switched
to the opposite chamber by re-exposure to photostimulation in chamber 2
(n = 4; P < 0.03). Values are mean ± s.e.m. See Extended Data Fig. 2b for
GFP control injections.
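
Group comparisons in this legend use the Mann-Whitney U-test. As an illustration of the statistic itself, and not of the authors' analysis code, the sketch below counts over all cross-group pairs how often a value from one group exceeds a value from the other, with ties counted as one half. It returns only the two U statistics; converting U to a P value is deliberately left out. The function name and the example values are made up.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) with a > b, plus one half for each tie;
    U_b is the complementary count, so U_a + U_b = len(a) * len(b).
    """
    if not group_a or not group_b:
        raise ValueError("both groups need at least one value")
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Made-up preference indices, for illustration only.
before = [-0.05, 0.02, 0.00, -0.10]
after = [0.25, 0.31, 0.18, 0.40]
print(mann_whitney_u(before, after))
```

The pairwise loop is quadratic in the group sizes, which is acceptable for groups of a few dozen animals; a rank-based formulation would be preferable for large samples.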


stimulus. Therefore, we asked whether photostimulation of the sweet 
cortical field in animals expressing ChR2 in sweet cortex could switch 
preference for an otherwise aversive tastant. Conversely, we also tested 
whether photostimulation of the bitter cortical field triggers aversion to 
an otherwise sweet, attractive tasting chemical. Our results (Extended 




Figure 2 | Photostimulation of bitter and sweet cortical fields drives 
aversive and appetitive behaviours. a, b, Representative raster plots (left) 
and histograms (right) illustrating licking events during a 5s licking window 
in the presence (blue) or absence (open) of light stimulation of (a) the bitter 
and (b) the sweet cortical fields. The purple line at time zero indicates the 
start of each trial; the green line indicates the onset of water delivery. 

c, d, Quantitation of licking responses with and without light stimulation in 
(c) the bitter cortical field (n = 34, Mann-Whitney U-test, P<4 x 10~”) or 
(d) sweet cortical field (n = 31, Mann-Whitney U-test, P<5 x 10~°) of 
wild-type mice. e, f, Quantitation of licking responses in TRPM5 knockout 
mice (e, bitter cortical fields, n =9, Mann-Whitney U-test, P<5 x 10>; 

f, sweet cortical fields, n = 10, Mann-Whitney U-test, P=0.001). Each point 
indicates data from an individual mouse before and after photostimulation. 


Data Fig. 3) show both postulates to be correct, and highlight how 
activation of selective taste cortical fields can mask the hedonic value 
of oral taste stimulation. 

The experiments described above show that direct control of primary 
taste cortex can evoke specific, reliable, and robust behaviours naturally 
symbolic of taste responses to chemical tastants. These gain-of-function 
studies also illustrate how top-down control of the taste pathway can 
activate innate, immediate responses representing sweet and bitter taste. 

To formally demonstrate that these cortically triggered behaviours are 
innate (that is, independent of learning or experience) we performed 
similar stimulation experiments in mice that had never tasted sweet or 
bitter chemicals (TRPM5 null mice!; Extended Data Fig. 4). Indeed, 
our results (Fig. 2e, f) showed that even in animals that had never expe- 
rienced sweet or bitter taste, ChR2 activation of the corresponding 
cortical fields still triggered the appropriate behavioural response, thus 
substantiating the predetermined nature of the sense of taste. 

It has been known for a long time that decerebrated animals can 
still exhibit stereotyped attraction and aversion to sweet and bitter 
chemicals!’. This is thought to be mediated by brainstem taste 
circuits dedicated to immediate responses''’”. Therefore, to evaluate 
the necessity (and sufficiency, see next section) of taste cortex in taste 




[Figure 3 panels. a, schematic and flow chart: cue, tone, one dry lick to trigger tastant delivery, then lick or no lick leading to water reward, air puff or no action, followed by the inter-trial interval. b, training/testing histograms of licks by trial number: quinine 100%, AceK 10%, CYX 100%, sucrose 0%. c, testing.]


Figure 3 | Go/no-go taste discrimination task in head-restrained mice.
a, Schematic and flow chart of the go/no-go taste discrimination task. Each
trial starts with a visual cue (purple line), followed 1 s later by a tone (green
line) to alert mice to initiate licking. After sampling, mice were given 3 s
to continue to lick (go) or withhold licking (no-go) in response to the test
tastant. For go trials, mice were rewarded with water (3 s) if they chose to
lick within the 3-s interval. For no-go trials, mice received a mild air puff
to the eyelid if they failed to withhold licking. After the reward/penalty
phase, the spout retracted and was cleared for the next trial; inter-trial
intervals were 8 s. b, Representative histograms illustrating recognition
and generalization within bitters and sweets. This animal was trained
and tested with 4 mM AceK (sweet no-go) and 0.5 mM quinine (bitter
go), and then assayed with 100 mM sucrose and 10 µM cycloheximide
(CYX). c, Quantitation in nine animals, demonstrating highly reliable taste
recognition and discrimination. Values are mean ± s.e.m.
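
The trial logic described in this legend can be summarized as a small decision table. The sketch below is a hedged reading of the legend rather than the authors' control software: the legend states the water reward for licking on go trials and the air puff for licking on no-go trials, while the 'no action' outcome for withheld licks is an assumption. The function name and return labels are hypothetical.

```python
def trial_outcome(trial_type, licked):
    """Return (is_correct, consequence) for one go/no-go trial.

    trial_type: "go" or "no-go" (the rule assigned to the tastant).
    licked: True if the mouse licked within the 3 s decision window.
    """
    if trial_type not in ("go", "no-go"):
        raise ValueError("trial_type must be 'go' or 'no-go'")
    if trial_type == "go":
        # Licking is the correct report and is rewarded with water.
        return (True, "water reward") if licked else (False, "no action")
    # On no-go trials licking is an error and triggers a mild air puff;
    # withholding is correct (assumed to have no further consequence).
    return (False, "air puff") if licked else (True, "no action")


print(trial_outcome("go", True))
print(trial_outcome("no-go", True))
```

Because the rule is attached to the tastant rather than to its innate value, the same function covers both training directions (for example, bitter as go and sweet as no-go).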


recognition and discrimination, we needed to design a test that bypasses 
immediate taste responses, and instead engages cortical circuits. In this 
assay (go or no-go behavioural test)¹³,¹⁴, thirsty animals were trained to
sample a test tastant from a spout, and then to report its identity either 
by licking (go) or withholding licking (no-go) (Fig. 3). This learned 
behaviour required the animal to sample the cue, recognize the tastant, 
and execute the appropriate behaviour in each trial. We trained ani- 
mals several ways, including to go to bitter and no-go to sweet, exactly 
the opposite of the innate drive. After 10-15 sessions of training (each 
consisting of 80 trials, with 40 randomly presented sweet and 40 bitter 
cues), mice were able to report the tastant’s identity with almost 90% 
accuracy (Fig. 3). To further demonstrate the selectivity of the assay and 
responses, we next tested the animals with sweet and bitter chemicals 
not used in the training phase. Given that all sweet tastants activate 
the same sweet taste receptor¹⁵⁻¹⁷, and all bitters the same class of taste
receptor cells¹⁸, we expected that novel sweets should also be recognized
as no-go cues, whereas novel bitters should be seen as go cues.
Indeed, animals trained with the bitter tastant quinine and the artificial 
sweetener acesulfame K (AceK) recognized and responded with similar 
accuracy to cycloheximide and sucrose, bitter and sweet tastants with 
completely different chemical structures from the training set (Fig. 3c). 

We implanted cannulae bilaterally into the bitter cortical fields of 
trained animals (Supplementary Table 1), waited 2 weeks for recovery, 
and assayed tastant discrimination in the go/no-go behavioural test 
before and after bilateral injection of a glutamate receptor antagonist 
(NBQX) to silence cortical activity¹⁹,²⁰. As shown in Fig. 4, silencing
the bitter cortical fields prevented animals from reliably identifying the 
bitter tastant (see Extended Data Fig. 5 for additional examples using 
the reverse training test). In contrast, their ability to recognize sweet 
tastants remained unimpaired. Importantly, the loss of bitter taste func- 
tion was fully reversible upon washout of the drug (Fig. 4a), whereas 
injection of a saline control in the bitter cortical fields had no significant 
effect on either bitter or sweet taste sensing (Fig. 4b). We used the same 
strategy to conduct loss-of-function experiments in the sweet cortex. 
Indeed, bilateral silencing of the sweet cortical fields disrupted sweet, 
but not bitter, taste discrimination (Fig. 4c, d). As expected, animals 


[Figure 4 panels a-d: performance ratio plotted by tastant (sweet, bitter). a, bitter cortex; bitter: no-go, sweet: go. b, saline (bitter cortex). c, sweet cortex; bitter: go, sweet: no-go. d, saline (sweet cortex).]


Figure 4 | Inactivation of the bitter and sweet cortical fields disrupts
taste discrimination. a, Quantitation of performance ratios (see Methods)
before and after bilateral silencing (NBQX, 5 mg ml⁻¹) of the bitter cortical
fields (n = 8); animals were trained to no-go to bitter and go to sweet. Note
the impact in bitter taste discrimination, but no significant effect in sweet
taste. After washout of the drug, the animal’s ability to recognize bitter is
restored. Comparable results are obtained when animals are instead trained
to go to bitter and no-go to sweet (Extended Data Fig. 5). b, Quantitation
of performance ratios with saline controls in bitter cortical fields; there is
no significant effect on sweet or bitter taste (n = 5; Mann-Whitney U-test,
P = 0.14). c, Quantitation of performance ratios with bilateral injection of
NBQX in the sweet cortical fields (n = 8). Animals were trained to no-go to
sweet and go to bitter; note significant deficit in sweet taste, but no effect on
bitter taste. After washout of the drug, the animal’s ability to recognize sweet
is restored. d, Saline injections in the sweet cortical fields have no significant
effect on bitter or sweet taste (n = 7; Mann-Whitney U-test, P = 0.80).
Values are mean ± s.e.m. Mann-Whitney U-test, **P < 0.01, ***P < 0.001.


recovered sweet taste perception after drug washout. Taken together, 
these results substantiate the essential role of the sweet and bitter cor- 
tical fields in sweet and bitter taste recognition. 

What is the mouse sensing upon direct activation of a taste cortical 
field? Does optogenetic stimulation create internal representations that 
mimic those evoked by sweet and bitter chemicals on the tongue? If so, 
we reasoned that animals trained to recognize and report the sensory 
features of an orally provided sweet or bitter tastant (for example, in a 
go/no-go assay) should respond similarly to optogenetic stimulation 
of the corresponding cortical fields, even though the animal had never 
been trained with light stimulation. In essence, if light and the chemical
tastant evoke similar percepts, then light will generalize to the learned 
responses associated with the orally supplied stimulus. 

We first focused on sweet, because activation of the bitter cortical 
field evokes prototypical and highly salient orofacial responses that are 
already strongly indicative of bitter perception (Supplementary Video 1). 
We introduced ChR2 into the sweet cortical field of untrained mice 
and validated robust light-triggered appetitive responses (see Fig. 2). 
Then, the mice were trained in a go/no-go behavioural test where they 
learned to associate go with a bitter chemical and a low-salt solution 
(Fig. 5a), and no-go with sweet taste. Critically, under this test, mice 
needed to report both an aversive (bitter) and an attractive cue (low 
salt, see also Extended Data Fig. 6) in the same arm of the behavioural 


© 2015 Macmillan Publishers Limited. All rights reserved 


Figure 5 panel labels: Training (Quinine, AceK, Salt); Testing (Quinine, AceK, Salt, Salt + light). x axis: Trial number.


Figure 5 | Cross-generalization between orally supplied taste stimuli
and photostimulation of the sweet cortex. a, Representative histograms
illustrating mouse performance during a training session in the go/no-go
discrimination task. The mouse was trained to go to bitter (0.5 mM quinine)
and low salt (20 mM NaCl), and no-go to sweet (4 mM AceK). Note that
both bitter (aversive) and low salt (attractive) were used in the same branch
of the behavioural task (go) to exclude the valence as an identifier.
b, Left: representative histograms illustrating cross-generalization between
taste stimulation and photostimulation of the sweet cortical field. Right:
quantitation of the responses from individual animals to quinine, AceK, salt
and salt + light (n = 8, Mann-Whitney U-test, P < 0.0002).


test, hence removing pure valence”! as a way of identifying tastants. 
After mice performed at or above 80% accuracy (Fig. 5a), we assayed 
whether light (previously triggering strong appetitive responses) was 
being sensed and reported as sweet (now a no-go response). Animals 
were tested with 50 randomized trials consisting of 20 bitter, 10 sweet, 
10 low salt, and 10 low salt linked to light stimulation of the sweet cor- 
tical field. Our results (Fig. 5b) showed that light stimulation of sweet 
cortex was indeed being sensed as a ‘fictive’ sweet stimulus, eliciting 
strong and reliable no-go responses; Extended Data Fig. 7 shows similar 
experiments and equivalent findings with bitter cortex. Taken together, 
these results show that activation of a taste cortical field recapitulates 
an internal representation (for example, perceptual quality) naturally 
indicative of the orally presented chemical. 
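
The composition of the probe session described above (50 randomized trials: 20 bitter, 10 sweet, 10 low salt, and 10 low salt paired with light) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `build_probe_session`, the trial labels, and the use of a simple seeded shuffle are hypothetical, and the text does not describe how the authors actually randomized trial order.

```python
import random
from collections import Counter

# Trial counts taken from the text: 50 trials per probe session.
PROBE_TRIAL_COUNTS = {
    "bitter": 20,
    "sweet": 10,
    "low_salt": 10,
    "low_salt_plus_light": 10,
}


def build_probe_session(counts=PROBE_TRIAL_COUNTS, seed=None):
    """Return a shuffled list of trial labels (hypothetical helper)."""
    # Expand each label into as many trials as requested.
    trials = [label for label, n in counts.items() for _ in range(n)]
    # A seeded generator makes a given session order reproducible.
    rng = random.Random(seed)
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    session = build_probe_session(seed=0)
    print(len(session), Counter(session))
```

Shuffling a fixed list, rather than drawing each trial independently, guarantees that every session contains exactly the stated number of each trial type.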

The essential role of the sense of taste is to evaluate the quality of 
a food source or a meal, and to activate the appropriate behavioural 
actions to consume or reject ingestion¹. The taste cortex is thought to
represent the basic sensory features of the different taste qualities’””?, 
and to function as a central neural ‘hub’ that informs and integrates 
with other brain areas, and the internal state, to guide taste-dependent 
actions. 

This work centred on the study of the two most distinctive taste 
qualities, sweet and bitter. These two differ not only in quality but 
also in valence, mediating innately attractive and aversive behaviours. 
Many studies have used optogenetics to activate ensembles of neurons 
and examine their physiological and behavioural consequences®**~””. 
In this work we explored the internal representation of arguably the 
two most recognizable chemosensory percepts. Our current studies 
demonstrate that it is possible to govern an animal's perception and 
behavioural responses by direct manipulation of selective taste cortical 
fields. Notably, unlike our other fundamental chemical sense (smell), 
activation of the sweet and bitter cortical fields evokes predetermined 
behavioural programs, independent of learning and experience, further 
illustrating the hardwired and innate nature of the sense of taste. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 




Received 7 April; accepted 29 September 2015. 
Published online 18 November 2015. 


1. Lindemann, B. Receptors and transduction in taste. Nature 413, 219-225 
(2001). 

2. Yamamoto, T. Taste responses of cortical neurons. Prog. Neurobiol. 23, 
273-315 (1984). 

3. Accolla, R., Bathellier, B., Petersen, C. C. & Carleton, A. Differential spatial 
representation of taste modalities in the rat gustatory cortex. J. Neurosci. 27, 
1396-1404 (2007). 

4. Chen, X., Gabitto, M., Peng, Y., Ryba, N. J. & Zuker, C. S. A gustotopic map of 
taste qualities in the mammalian brain. Science 333, 1262-1266 (2011). 

5. Boyden, E. S., Zhang, F., Bamberg, E., Nagel, G. & Deisseroth, K. Millisecond- 
timescale, genetically targeted optical control of neural activity. Nature 
Neurosci. 8, 1263-1268 (2005). 

6. Lammel, S. et al. Input-specific control of reward and aversion in the ventral 
tegmental area. Nature 491, 212-217 (2012). 

7. Halpern, B. P. in Drinking Behavior (eds Weijnen, J. A. W. M. & Mendelson, J.) 
1-92 (Springer, 1977). 

8. Guo, Z. V. et al. Procedures for behavioral experiments in head-fixed mice. PLoS
ONE 9, e88678 (2014).

9. Grill, H. J. & Norgren, R. The taste reactivity test. I. Mimetic responses to
gustatory stimuli in neurologically normal rats. Brain Res. 143, 263-279
(1978).

10. Zhang, Y. et al. Coding of sweet, bitter, and umami tastes: different receptor
cells sharing similar signaling pathways. Cell 112, 293-301 (2003).

11. Grill, H. J. & Norgren, R. The taste reactivity test. II. Mimetic responses to
gustatory stimuli in chronic thalamic and chronic decerebrate rats. Brain Res.
143, 281-297 (1978).

12. Reilly, S. & Pritchard, T. C. Gustatory thalamus lesions in the rat: I. Innate taste
preferences and aversions. Behav. Neurosci. 110, 737-745 (1996).

13. Gardner, M. P. & Fontanini, A. Encoding and tracking of outcome-specific 
expectancy in the gustatory cortex of alert rats. J. Neurosci. 34, 13000-13017 
(2014). 

14. Graham, D. M., Sun, C. & Hill, D. L. Temporal signatures of taste quality driven 
by active sensing. J. Neurosci. 34, 7398-7411 (2014). 

15. Li, X. et al. Human receptors for sweet and umami taste. Proc. Natl Acad.
Sci. USA 99, 4692-4696 (2002).

16. Nelson, G. et al. Mammalian sweet taste receptors. Cell 106, 381-390 (2001).

17. Zhao, G. Q. et al. The receptors for mammalian sweet and umami taste. Cell
115, 255-266 (2003).

18. Mueller, K. L. et al. The receptors and coding logic for bitter taste. Nature 434,
225-229 (2005).

19. Calu, D. J., Roesch, M. R., Haney, R. Z., Holland, P. C. & Schoenbaum, G. Neural 
correlates of variations in event processing during learning in central nucleus 
of amygdala. Neuron 68, 991-1001 (2010). 

20. Tye, K. M. et al. Amygdala circuitry mediating reversible and bidirectional 
control of anxiety. Nature 471, 358-362 (2011). 

21. Small, D. M. et al. Dissociation of neural representation of intensity and 
affective valuation in human gustation. Neuron 39, 701-711 (2003). 

22. Spector, A. C. & Travers, S. P. The representation of taste quality in the 
mammalian nervous system. Behav. Cogn. Neurosci. Rev. 4, 143-191 (2005). 

23. Simon, S. A., de Araujo, I. E., Gutierrez, R. & Nicolelis, M. A. The neural
mechanisms of gustation: a distributed processing code. Nature Rev. Neurosci.
7, 890-901 (2006).

24. Witten, I. B. et al. Cholinergic interneurons control local circuit activity and
cocaine conditioning. Science 330, 1677-1681 (2010).

25. Choi, G. B. et al. Driving opposing behaviors with ensembles of piriform
neurons. Cell 146, 1004-1015 (2011).

26. Atasoy, D., Betley, J. N., Su, H. H. & Sternson, S. M. Deconstruction of a neural 
circuit for hunger. Nature 488, 172-177 (2012). 

27. Nieh, E. H. et al. Decoding neural circuits that control compulsive sucrose
seeking. Cell 160, 528-541 (2015). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We particularly thank H. Fischman and R. Lessard for 
suggestions, and members of the Zuker laboratory for comments. We also thank 
D. Salzman, K. Scott, and R. Axel for discussions. This research was supported
in part by a grant from the National Institute of Drug Abuse (DA035025) to
C.S.Z., and the Intramural Research Program of the National Institutes of Health, 
National Institute of Dental and Craniofacial Research (to N.J.P.R.). C.S.Z. is an 
investigator of the Howard Hughes Medical Institute and a Senior Fellow at 
Janelia Farms Research Campus, Howard Hughes Medical Institute. 


Author Contributions Y.P. designed the study, performed experiments, and 
analysed data; S.G.-S. performed animal studies, viral injections, histology
and analysed data; H.J. performed c-Fos expression studies; D.T. developed the
initial behavioural platforms; N.J.P.R. and C.S.Z. designed the study, analysed 
data, and together with Y.P. wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to C.S.Z. 
(cz2195@cumc.columbia.edu) or N.J.P.R. (nick.ryba@nih.gov). 


26 NOVEMBER 2015 | VOL 527 | NATURE | 515 




METHODS 


Stereotaxic injections and anatomy. All procedures were performed according to
the approved protocols at Columbia University. Six- to eight-week-old C57BL6/J
and Trpm5−/− mice were used for viral injections. All surgeries were performed
using aseptic technique. Mice were anaesthetized with ketamine and xylazine
(100 mg/kg body weight and 10 mg/kg body weight, intraperitoneal), placed into
a stereotaxic frame, and unilaterally injected with ~30 nl AAV carrying ChR2
(AAV9.CamKIIa.hChR2(H134R)-EYFP.WPRE.SV40, Penn Vector Core) either
in the sweet cortical field (bregma 1.6 mm; lateral 3.1 mm; ventral 1.8 mm), or
the bitter cortical field (bregma −0.3 mm; lateral 4.2 mm; ventral 2.8 mm). After
viral injection, a guide cannula (26 gauge, PlasticsOne) or a customized implantable
fibre (200 µm, numerical aperture = 0.39) was implanted 300-500 µm above
the injection site, and fixed in place with dental cement. A metal head-post was
also attached and secured with dental cement for the purpose of head fixation
during behavioural experiments. For pharmacological experiments, AAV-ChR2
was injected bilaterally in the sweet or bitter cortical fields, followed by bilateral
implantation of guide cannulae. Mice were allowed to recover for 2-3 weeks before
the start of behavioural experiments. Placements of viral injections, guide cannulae,
and implantable fibres were histologically verified at the termination of
the experiments by TO-PRO3 (1:1,000, Invitrogen) staining of coronal sections
(100 µm). Fluorescent images were acquired using a confocal microscope (FV1000,
Olympus).

Animals. All behavioural experiments with wild-type animals used 6- to 8-week- 
old male C57BL6/J mice. No statistical methods were used to predetermine sample 
size, and investigators were not blinded to group allocation. No method of random- 
ization was used to determine how animals were allocated to experimental groups. 

In vivo recordings. Mice expressing ChR2 in taste cortex were anaesthetized with
urethane (1.8 mg/g body weight), and the insular cortex was exposed as previously
described⁴. Extracellular neural activity was recorded using a tungsten electrode
(resistance 2.0-4.0 MΩ, FHC). Data were acquired, amplified, digitized, and
bandpass filtered at 600-6,000 Hz with a Neuralynx data acquisition system. For
photostimulation, 10 Hz, 5-ms pulses of 473 nm light (~5 mW) were delivered via
a solid-state laser (Shanghai Laser & Optics Century Co.) coupled to an optical
fibre (200 µm) positioned above the insular cortex.

c-Fos induction and immunohistochemistry. Individual mice were implanted
with an intraoral cannula²⁸ 3 days before c-Fos induction. On the day of experiments,
mice were anaesthetized with urethane (1.6 mg/g body weight) and the
trachea was cannulated to aid breathing during oral stimulus presentation. Tastants
were perfused into the mouth through the intraoral cannula for 1.5 h at a rate of
~6 ml h⁻¹. Mice were allowed to rest for 30 min and processed for immunostaining
as previously described. The brains were sectioned coronally at 100 µm, and
labelled with goat anti-c-Fos (Santa Cruz, sc-52-G) overnight; Alexa 488 donkey
anti-goat or Cy3 donkey anti-goat (Jackson ImmunoResearch) were used to visualize
c-Fos expression. All images were taken using an Olympus FluoView 1000
confocal microscope.

Place preference assays. Individual mice were tested in a custom-built two- 
chamber arena (30 cm × 30 cm total size). To differentiate the chambers, one chamber
was designed with alternating black and white vertical stripes on its walls,
whereas the other chamber was uniformly black. The arena was contained within 
a sound-attenuating cubicle (Med Associates). Mice were trained in the arena for 
30 min with photostimulation of the sweet or bitter cortical field, and tested in the 
absence of any light stimulation for 5 min at the end of each session (defined as 
‘preference test’). Animal locations were tracked in real time by video imaging. At 
the beginning of the experiments, mice were acclimated to the arena for one session 
without light stimulation (defined as the pre-test condition). Photostimulation 
sessions began the next day, with two daily sessions for about 1 week. For each 
mouse, one chamber was randomly selected for photostimulation (chamber 1); 
when a mouse was located in this chamber, light was delivered (20 Hz, 20-ms 
pulses, 5-10 mW) for 5-s intervals, with 5-s rest periods to avoid over-stimulation 
or phototoxicity. After 1 week of sessions, a ‘reverse probe’ study was performed in 
a subset of animals, during which photostimulation was delivered in the opposing 
chamber (chamber 2). Animals were trained for a minimum of eight sessions, and 
the preference tests from the last three sessions were used to calculate the preference
index (PI); $\mathrm{PI} = (t_1 - t_2)/(t_1 + t_2)$, where $t_1$ is the fractional time a mouse spent
in chamber 1, and $t_2$ is the time spent in chamber 2.
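
As a worked illustration of the preference index, the following Python sketch computes PI from the time spent in each chamber and averages it over the last three preference tests. The function names are hypothetical, and averaging per-session PI values is an assumption: the text states only that the last three sessions were used, not how they were combined.

```python
def preference_index(t1, t2):
    """PI = (t1 - t2) / (t1 + t2).

    t1 and t2 are the times (or fractional times) spent in chamber 1
    and chamber 2. PI ranges from -1 (all time in chamber 2) to +1
    (all time in chamber 1).
    """
    total = t1 + t2
    if total <= 0:
        raise ValueError("t1 + t2 must be positive")
    return (t1 - t2) / total


def mean_pi_last_sessions(session_times, n_last=3):
    """Average PI over the last n_last sessions (assumed pooling rule).

    session_times is a list of (t1, t2) pairs, one per preference test,
    in chronological order.
    """
    last = session_times[-n_last:]
    if not last:
        raise ValueError("no sessions supplied")
    return sum(preference_index(t1, t2) for t1, t2 in last) / len(last)


if __name__ == "__main__":
    # Hypothetical mouse that spends 75% of each test in chamber 1.
    sessions = [(0.5, 0.5), (0.75, 0.25), (0.75, 0.25), (0.75, 0.25)]
    print(mean_pi_last_sessions(sessions))
```

A positive PI indicates a preference for the photostimulation-paired chamber (chamber 1), and a negative PI indicates avoidance.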

Lick preference assays. Mice were first water-deprived for 24 h to motivate drinking
behaviour. They were then introduced to head restraint and acclimated to
drinking from a motor-positioned spout in 60-trial sessions (15 min), twice a day
for 3 days. Each trial began with a flash, followed 1 s later by the spout swinging
into position and a tone (4 kHz) to indicate the onset of water delivery. The
spout remained in position for 5 s and was then removed. Mice were weighed daily
during the habituation period as well as during any behavioural tests requiring
water restriction. Additional water was supplied as necessary to ensure that animals
maintained at least 85% of their initial body weight. To measure attractive/
appetitive responses, mice were mildly water restrained (exhibiting an average
of not more than 15 licks per 5-s trial in the lick preference assay), and supplied
with approximately 5 µl water during each trial. To measure aversion, mice were
water-deprived for 24 h, and supplied with approximately 10 µl water distributed
over the full 5 s of spout presentation for each trial (so that animals remained
eager to lick for all 5 s). To ensure animals were appropriately motivated in the
lick preference behavioural assays (that is, thirsty to examine lick suppression,
and mildly satiated to examine attraction), we examined animals exhibiting an
average of at least 20 licks per 5-s trial as an indicator of ‘thirst’, and not more than
15 licks per 5-s trial for mild satiation. Animals were recorded by video for the
entire session, and licks were analysed and counted by custom-written MATLAB 
software (Mathworks). Light stimulation and water delivery were controlled by 
the same software via an Arduino board. All animals analysed in these studies had 
histologically confirmed expression of ChR2 in the sweet or bitter cortical fields 
(Supplementary Table 1). 

Go/no-go taste discrimination behaviour. Mice deprived of water for 24 h were
first acclimated to consuming water in a head-restrained position for 15-min
sessions over 2-3 days. Animals were then trained to perform a taste discrimination
task, in which they were to lick, and receive a water reward, in response
to a 2-µl presentation of tastant-1 (‘go’) and to withhold licking in response to
tastant-2 (‘no-go’). The presentation of the go and no-go stimuli was randomized.
Each trial began with a visual cue (100-ms light flash), followed 1 s later by a tone
(4 kHz, 300 ms) alerting the animal to sample the test tastant (for example, AceK or
quinine; ~2 µl per sample). After sampling, mice were given 3 s either to continue
to lick the spout (go trial) or to withhold licking (no-go trials). On go trials, if a
mouse chose to lick within the 3-s interval, it was then rewarded with water for
3 s. On no-go trials, if a mouse failed to withhold licking within the 3-s interval,
it was given a penalty of a gentle air puff to the eyelid. Mice were trained for two
sessions per day, with 80 randomized trials (20 min) per session. For analysis, a ‘go’
response was defined as four or more licks in the second before reward or penalty.
For photostimulation experiments, mice were first trained until they could effectively
discriminate the tastants with ~90% accuracy (over 1-2 weeks). Then, on the
‘probe’ sessions, tastants and/or cortical photostimulation were presented during
the sample period. Neither reward nor punishment was delivered for novel tastants
or light stimulation. Before testing, animals with correctly placed cannulae were
provisionally identified by ChR2 expression followed by one or two sessions of lick
preference pre-tests. All animals analysed had histologically confirmed placement
of cannulae and expression of ChR2 in the appropriate cortical field.
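
The scoring rule stated above (a 'go' response is four or more licks in the second before reward or penalty) can be sketched in Python as follows. The `Trial` record and function names are hypothetical and introduced only for illustration; the authors' own analysis code is not described in the text.

```python
from dataclasses import dataclass

# Threshold from the text: four or more licks counts as a 'go' response.
GO_LICK_THRESHOLD = 4


@dataclass
class Trial:
    """One go/no-go trial (hypothetical record layout)."""
    is_go_trial: bool              # True if the tastant was the 'go' stimulus
    licks_in_decision_second: int  # licks in the 1 s before reward or penalty


def is_go_response(licks):
    """Classify a trial as a 'go' response from its lick count."""
    return licks >= GO_LICK_THRESHOLD


def session_accuracy(trials):
    """Fraction of trials in which the response matched the trial type."""
    if not trials:
        raise ValueError("no trials supplied")
    correct = sum(
        is_go_response(t.licks_in_decision_second) == t.is_go_trial
        for t in trials
    )
    return correct / len(trials)


if __name__ == "__main__":
    example = [
        Trial(True, 6),   # go trial, licked: correct
        Trial(True, 2),   # go trial, too few licks: incorrect
        Trial(False, 0),  # no-go trial, withheld: correct
        Trial(False, 1),  # no-go trial, below threshold: correct
    ]
    print(session_accuracy(example))
```
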

Pharmacological inhibition. Mice were trained to discriminate sweet from bitter
in the go/no-go task with at least 90% accuracy. On the day of the experiment,
mice were first tested with four taste stimuli (pre-test), including the original
training tastants (2 mM AceK and 0.1 mM quinine) and a novel sweet and bitter
tastant (50 mM sucrose and 2 µM cycloheximide). After the test, 0.3 µl of the glutamate
receptor antagonist NBQX (5 mg ml⁻¹ in 0.9% NaCl, Tocris Bioscience) was
bilaterally infused into the chosen insular cortical fields over a period of 3 min.
NBQX was delivered via an internal infusion needle inserted into the same guide
cannulae used for light stimulation and connected to a 1-µl Hamilton syringe
(PlasticsOne). Saline (0.9% NaCl) was used as control. After NBQX or saline
infusion, animals were placed in their home cages to rest for 1.5 h. Mice were then
re-tested with the same four taste stimuli on the go/no-go task (NBQX-test) and
then at 8-24 h after rest (recovery-test). During tests, a water reward was given
for correctly identifying the go cue, but no air puff was delivered for incorrectly
identifying the no-go cue (to avoid possible re-learning). No reward or punishment
was applied for the novel sweet and bitter tastants. A performance ratio
was calculated for each taste quality: $\mathrm{ratio} = r_1/r_2$, where $r_1$ is the percentage of
correct responses during the NBQX-test or recovery-test, and $r_2$ is the percentage
of correct responses during the pre-test. The percentage of correct responses for
each taste quality was the average of go (%) for go taste stimuli (for example,
quinine and cycloheximide), or of (100 − go (%)) for no-go
stimuli (for example, AceK and sucrose). All animals analysed had anatomically
confirmed placement of cannulae in the appropriate cortical field. We note that we
made several unsuccessful attempts to optogenetically silence the sweet and bitter
cortical fields; this may have been due in part to the requirement for expression
in most, if not all, relevant neurons.
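
The performance ratio defined in this section can be made concrete with a small Python sketch. It assumes that go (%) values are supplied per tastant and averaged within a taste quality, as the text describes; the function names and the example numbers are hypothetical and are not data from the study.

```python
def percent_correct(go_percentages, is_go_quality):
    """Percentage of correct responses for one taste quality.

    go_percentages: go (%) for each tastant of that quality
                    (for example, quinine and cycloheximide for bitter).
    is_go_quality:  True if the quality was trained as 'go';
                    for a 'no-go' quality, correct = 100 - go (%).
    """
    if not go_percentages:
        raise ValueError("no tastants supplied")
    if is_go_quality:
        scores = list(go_percentages)
    else:
        scores = [100.0 - g for g in go_percentages]
    return sum(scores) / len(scores)


def performance_ratio(r1, r2):
    """ratio = r1 / r2.

    r1: percent correct during the NBQX-test or recovery-test.
    r2: percent correct during the pre-test.
    """
    if r2 <= 0:
        raise ValueError("pre-test percent correct must be positive")
    return r1 / r2


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    pre = percent_correct([90.0, 80.0], is_go_quality=True)   # 85.0
    drug = percent_correct([45.0, 40.0], is_go_quality=True)  # 42.5
    print(performance_ratio(drug, pre))
```

A ratio near 1 means performance was unchanged relative to the pre-test, whereas a ratio well below 1 indicates impaired recognition of that taste quality.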


28. Tokita, K., Armstrong, W. E., St John, S. J. & Boughter, J. D. Jr. Activation of 
lateral hypothalamus-projecting parabrachial neurons by intraorally delivered 
gustatory stimuli. Front. Neural Circuits 8, 86 (2014). 




Extended Data Figure 1 panel labels: Sweet cortex; Bitter cortex; Quinine, wild type; Quinine, TRPM5 ko; Light stimulation; No light. Bregma positions: −0.2, 0, +0.7, +1.5.

Extended Data Figure 1 | Expression of ChR2 in taste cortex. a, Samples
of injection sites in the bitter and sweet cortical fields; shown are coronal
sections (Fig. 1a shows a whole mount brain). ChR2-YFP expression
(green), nuclei (blue; TO-PRO-3); numbers indicate position relative to
bregma, and the dotted area highlights the location of the taste cortical
fields (see c). b, Activation of insular neurons in sweet cortex triggers
robust c-Fos expression; ChR2-YFP (green), c-Fos (red) after 10 min of
in vivo photostimulation at 20 Hz, 20-ms pulses (5 s laser on, 5 s laser off,
5 mW). Dashed lines indicate the location of the stimulating cannulae/fibre.
c, c-Fos (red) expression in bitter cortex (bregma 0, −0.2) after bitter
tastant stimulation (10 mM quinine; see Methods for details). Note the
absence of c-Fos expression in the middle (bregma +0.7) and sweet insular
cortex (bregma +1.5). Importantly, specific labelling is abolished in taste-blind
animals (TRPM5 knockouts; middle row). The bottom row shows a
diagram of the corresponding brain areas, adapted from the Allen Brain
Atlas. Scale bars: 1 mm (a), 500 µm (b), 300 µm (c). PIR, piriform cortex;
IC, insular cortex.




Extended Data Figure 2 panel labels: Sweet cortex; Bitter cortex; Chamber 2. x axis: Session #.

Extended Data Figure 2 | Acquisition of place preference. a, The
development of ‘place preference’ as a function of session number (each
session was 30 min of training and 5 min of ‘after-training’ testing in the
absence of light stimulation; n = 13 for sweet cortex, n = 15 for bitter
cortex; see text and Methods for details). The average of sessions 6-8 was
used in Fig. 1. Values are mean ± s.e.m. b, Representative mouse track and
quantitation of preference index in control GFP-expressing mice; note no
difference in preference between chambers (n = 14; Mann-Whitney
U-test, P = 0.74). Values are mean ± s.e.m.




Extended Data Figure 3 panel labels: ChR2 in bitter cortex (P = 0.015, P = 0.0002); ChR2 in sweet cortex (P = 0.013, P = 0.036). Conditions shown include Water, Sweet and Sweet + light. y axis: Licks/5 s.

Extended Data Figure 3 | Photostimulation of insular cortical fields
overcomes natural taste valence. a, Quantitation of licking responses
in mice expressing ChR2 in the bitter cortical fields (n = 13, analysis of
variance (ANOVA) test, Tukey’s honest significant difference post hoc
test). Photostimulation of the bitter cortical fields significantly suppresses
the natural attraction of the sweet tastant (4 mM AceK). b, Quantitation
of licking responses in mice expressing ChR2 in the sweet cortical fields
(n = 14, ANOVA test, Tukey’s honest significant difference post hoc test).
Photostimulation of the sweet cortical fields significantly overcomes the
natural aversion of the bitter tastant (1 mM quinine). In both experiments,
mice were water-restrained (but exhibited an average of not more than
30 licks per 5-s water trial) such that they were motivated to drink the
bitter while showing attraction to sweet. Values are mean ± s.e.m.




Extended Data Figure 4 panel label: TRPM5−/−. y axis: Licks/5 s.

Extended Data Figure 4 | TRPM5 knockout mice do not taste sweet
and bitter. Taste preference was tested in the head-restrained assay for
wild type and TRPM5 homozygous mutants. Tastants were randomly
delivered for a 5-s window (ten trials each). No significant difference was
observed between water and sweet/bitter tastants in TRPM5 knockouts
(ANOVA test, P = 0.62, n = 10; see ref. 10 for more details); circles indicate
individual animals; bar graphs show mean ± s.e.m.




Extended Data Figure 5 panel labels: Saline (bitter cortex); Bitter: Go, Sweet: No-Go; (n = 7). x axis: Tastants (Sweet, Bitter).

Extended Data Figure 5 | Inactivation of the bitter cortical fields in
animals trained to go to bitter and no-go to sweet. a, Quantitation of
performance ratios before and after bilateral silencing of the bitter cortical
fields (NBQX, 5 mg ml⁻¹; n = 7) in animals trained to go to bitter and
no-go to sweet. Note the impact in bitter taste discrimination, but no
significant effect in sweet taste (Mann-Whitney U-test, P < 0.002). After
washout of the drug, the animal’s ability to recognize bitter is restored
(Mann-Whitney U-test, P < 0.005). b, Quantitation of performance
ratios with saline (0.9%) control in the bitter cortical fields (n = 6,
Mann-Whitney U-test, P = 0.56). In both experiments, mice were trained
with quinine and AceK, and tested with two pairs of sweet/bitter tastants
(0.1 mM quinine and 2 mM AceK, 2 µM cycloheximide and 50 mM
sucrose; see Methods for details).




Extended Data Figure 6 panel annotation: P < 0.001. y axis: Licks/5 s.

Extended Data Figure 6 | Sweet and low salt are appetitive tastants.
Taste preference was tested during a 10-min window using the head-restrained
assay (see Methods for details). Four tastants were randomly
delivered to animals for 5 s each (ten trials per tastant). Note that animals
show significant attraction to sweet (AceK) and low salt (NaCl), but
strong aversion to bitter (n = 11, ANOVA test, Tukey’s honest significant
difference post hoc test); circles indicate individual animals; bar graphs
show mean ± s.e.m. These conditions were used in the experiments
described in Fig. 5 and Extended Data Fig. 7.




Extended Data Figure 7 panel labels: quinine, AceK, salt. x axis: trial.

Extended Data Figure 7 | Cross-generalization between orally supplied
taste stimuli and photostimulation of the bitter cortex. a, Representative
histograms illustrating cross-generalization between taste stimulation and
photostimulation of the bitter cortical field. The mouse was trained to go
to bitter (0.5 mM quinine) and no-go to sweet (4 mM AceK) and low salt
(20 mM NaCl). b, Quantitation of the responses from individual animals
to quinine, AceK, salt and salt + light (n = 8, Mann-Whitney U-test,
P < 0.002). See also Fig. 4.




LETTER 


doi:10.1038/nature16148 


Drosophila Ionotropic Receptor 25a mediates
circadian clock resetting by temperature 


Chenghao Chen¹*, Edgar Buhl²*, Min Xu¹, Vincent Croset³, Johanna S. Rees⁴, Kathryn S. Lilley⁴, Richard Benton³,
James J. L. Hodge² & Ralf Stanewsky¹


Circadian clocks are endogenous timers adjusting behaviour and 
physiology with the solar day’. Synchronized circadian clocks 
improve fitness? and are crucial for our physical and mental well- 
being®. Visual and non-visual photoreceptors are responsible for 
synchronizing circadian clocks to light*”, but clock-resetting is 
also achieved by alternating day and night temperatures with only 
2-4°C difference®*. This temperature sensitivity is remarkable 
considering that the circadian clock period (~24h) is largely 
independent of surrounding ambient temperatures!*, Here we show 
that Drosophila Ionotropic Receptor 25a (IR25a) is required for 
behavioural synchronization to low-amplitude temperature cycles. 
This channel is expressed in sensory neurons of internal stretch 
receptors previously implicated in temperature synchronization 
of the circadian clock’. IR25a is required for temperature- 
synchronized clock protein oscillations in subsets of central clock 
neurons. Extracellular leg nerve recordings reveal temperature- 
and IR25a-dependent sensory responses, and IR25a misexpression 
confers temperature-dependent firing of heterologous neurons. We 
propose that IR25a is part of an input pathway to the circadian clock 
that detects small temperature differences. This pathway operates 
in the absence of known ‘hot’ and ‘cold’ sensors in the Drosophila 
antenna!®"" revealing the existence of novel periphery-to-brain 
temperature signalling channels. 

In Drosophila, daily activity rhythms are controlled by a network 
of ~150 clock neurons expressing the clock genes period (per) and 
timeless (tim). These encode repressor proteins that negatively feed- 
back on their own promoters resulting in 24h oscillations of clock 
molecules. Temperature cycles (TC) synchronize molecular clocks 
present in peripheral appendages in a tissue-autonomous manner®”?, 
whereas synchronization of clock neurons in the brain mainly depends 
on peripheral temperature receptors located in the chordotonal organs 
(ChO) and the ChO-expressed gene nocte”!*, 

To discover novel factors involved in temperature entrainment, we 
identified NOCTE-interacting proteins by co-immunoprecipitation 
and mass-spectrometry (Extended Data Table 1)'*. We focused on 
IR25a, a member of a divergent subfamily of ionotropic glutamate 
receptors and verified the interaction by co-immunoprecipitation 
after overexpressing IR25a and NOCTE in all clock cells using tim-gal4 
(Extended Data Fig. 1a). IR25a is expressed in different populations 
of sensory neurons, including those in the antenna and labellum!>~!’. 
In the olfactory system IR25a acts as a co-receptor with different 
odour-sensing IRs"°. 

To investigate if IR25a is co-expressed with nocte in ChO, we ana- 
lysed IR25a expression in femur and antennal ChO using an IR25a- 
gal4 line!> (Extended Data Fig. 2a). IR25a-gal4-driven mCD8-GFP 
labelled subsets of ChO neurons in the femur, overlapping substan- 
tially with nompC-QF driven QUAS-Tomato signals (using the QF 
binary transcriptional activation system) (Fig. 1a-c). nompC-QF is


expressed in larval ChO!8 and in the adult femur ChO (Fig. 1d, e). 
Comparison of IR25a-driven mCD8-GFP and nuclear DsRed sig- 
nals with those of other ChO neuron drivers (F-gal4 and nocte-gal4 
(ref. 9)) suggests that IR25a is expressed in a subset of femur ChO neurons
and Johnston’s Organ (JO) neurons (Fig. 1c and Extended Data 
Fig. 1b-g). To determine if IR25a-gal4 ChO signals reflect endoge- 
nous IR25a expression, we confirmed the presence of IR25a mRNA in 
the femur and leg (Extended Data Fig. 2b, e) and the co-localization 
of anti-IR25a immunofluorescence signals in femur ChO neurons 
(Fig. 1f, g). IR25a was detected in ChO neuron cell bodies and cili- 
ated dendrites, as was an mCherry-IR25a fusion protein expressed 
in these cells (Fig. 1h). 

As nocte¹ mutants do not synchronize to 12 h:12 h 16 °C:25 °C
temperature cycles in constant light (LL)? (Extended Data Fig. 3a),
we analysed IR25a−/− mutants’© under these conditions. Unlike
nocte¹, the IR25a−/− flies synchronized well to this regime and we
obtained similar results at warmer temperature cycles (Extended Data
Fig. 3a). To test whether IR25a is specifically required for synchronization
to small temperature intervals”!>, we subjected IR25a−/−
flies to various temperature cycles with an amplitude of only 2 °C.
Surprisingly, and in contrast to wild-type, IR25a−/− mutants did not
synchronize to any of the shallow temperature cycles in LL or constant
darkness (DD) (Fig. 2a-e and Extended Data Figs 3b and 4c).
In LL, wild-type and IR25a rescue flies showed a clear activity peak
in the second part of the warm period before and after the 6 h shift
of the temperature cycle. By contrast, IR25a−/− mutants were constantly
active throughout the temperature cycle, apart from a short
period of reduced activity at the beginning of the warm phase of TC1
(Fig. 2a and Extended Data Fig. 3b). In DD, control flies slowly
advanced (or delayed) their evening activity peak during phase-advanced
(or delayed) temperature cycles (Fig. 2b and Extended Data
Fig. 4c). The phase of this activity peak was maintained in the subsequent
free-running conditions (DD, constant 25 °C) indicating stable
re-entrainment of the circadian clock (Fig. 2b and Extended Data
Fig. 4). By contrast, IR25a mutants did not shift their evening peak
during the temperature cycle, keeping their original phase throughout
the experiment (Fig. 2b and Extended Data Fig. 4c).

To quantify entrainment in LL, we determined the ‘entrainment
index’ (EI), whereas for most DD experiments we calculated the
phase difference of the main activity peak upon release into constant
conditions between IR25a mutants and controls. In all 2 °C amplitude
temperature cycles tested, the entrainment index of IR25a−/− flies was
significantly lower, and phase calculation indicated no phase shift or
a significantly reduced phase shift compared to controls (Fig. 2c-e).
The same non-synchronization phenotype was observed in IR25a−/
Df(IR25a) flies, and temperature synchronization was fully restored in
IR25a−/− rescue flies (Fig. 2a-d and Extended Data Fig. 3b). IR25a−/−
mutants synchronize to light and have normal free-running and


1Department of Cell and Developmental Biology, University College London, 21 University Street, London WC1E 6DE, UK. 2School of Physiology and Pharmacology, University of Bristol,
University Walk, Bristol BS8 1TD, UK. 3Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland. 4Cambridge Centre for
Proteomics, Department of Biochemistry and Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge CB2 1QW, UK.


*These authors contributed equally to this work. 


516 | NATURE | VOL 527 | 26 NOVEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


Figure 1 | IR25a is expressed in ChO neurons.


a, Overview of the femur ChO adapted from). 

b, d, Double labelling of the femur ChO by IR25a-gal4
(b) and F-gal4 (d) driven mCD8-GFP and
nompC-QF driven QUAS-Tomato. c, e, Higher
magnification of circled ChO areas in b and d,
respectively. f, IR25a immunolabelling of femoral
ChO cryosections of IR25a-gal4/UAS-mCD8-GFP
flies. From left to right, GFP, anti-IR25a, 22C10,
and merged images are shown. g, Anti-IR25a
and 22C10 labelling of femur ChO sections of
IR25a−/− flies. h, Subcellular distribution of an
mCherry-IR25a fusion protein co-labelled with
the dendritic cap marker nompA-GFP in the
femur ChO. Scale bar, 20 μm.




temperature-compensated periods (Fig. 2b, Extended Data Fig. 4d and
Extended Data Table 2). These results suggest that IR25a enables the
circadian clock to sense subtle temperature changes across the entire
physiological range, rather than mediating synchronization to a specific
range. Increasing the temperature cycle amplitude to 4 °C consistently
restored temperature entrainment in IR25a−/− flies (Extended Data
Fig. 4a, b).
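
The phase comparison used for the DD experiments can be illustrated with a short sketch. This is not the authors' analysis code: it only assumes that peak phases are expressed in hours on a 24 h clock, and the function names and example values are hypothetical. It shows how a signed phase difference between a mutant and a control peak can be computed so that differences wrap correctly around midnight, and how a group mean phase can be taken on the circle.

```python
import math


def phase_difference(mutant_peak_h, control_peak_h, period_h=24.0):
    """Signed circular difference (mutant minus control) in hours.

    The result lies in [-period/2, period/2), so a peak at 23 h compared
    with a peak at 1 h is reported as -2 h rather than +22 h.
    """
    half = period_h / 2.0
    return (mutant_peak_h - control_peak_h + half) % period_h - half


def circular_mean_phase(phases_h, period_h=24.0):
    """Mean phase of several flies, computed on the circle (hours)."""
    angles = [2.0 * math.pi * p / period_h for p in phases_h]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    mean_angle = math.atan2(mean_sin, mean_cos)
    return (mean_angle * period_h / (2.0 * math.pi)) % period_h


# Hypothetical example: control peak at 1 h, mutant peak at 23 h.
delta = phase_difference(23.0, 1.0)
```

Statistical testing of such phase differences (the paper uses a Watson-Williams-Stevens test) is outside the scope of this sketch.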

Temperature receptors located in fly antennae and arista are not 
required for temperature-synchronized behaviour®!"!”. As expected, 
we found that antennal IR25a function (Extended Data Figs 1c
and 2a)!* is not required for temperature entrainment (Extended Data 
Fig. 5). To reveal the importance of IR25a expression in ChO neurons, 
we performed tissue-specific IR25a RNA interference (RNAi) using 
validated transgenes (Extended Data Figs 2d and 6a). IR25a RNAi in 
all or subsets of ChO neurons (Fig. 1 and Extended Data Fig. 1) resulted 
in a lack of entrainment (Extended Data Figs 2e and 6b, c). By contrast, 
IR25a RNAi in multidendritic, TRPA1-expressing or clock neurons did 
not impair temperature entrainment (Extended Data Fig. 6c). These 
findings are consistent with the absence of IR25a expression in clock 
neurons and the brain (Extended Data Fig. 2e-g) and show that IR25a 
functions in ChO neurons for temperature entrainment to 25 °C:27 °C 
temperature cycles in LL. 

To identify the neural substrates underlying the lack of behavioural
synchronization, we quantified clock protein levels in wild-type,
IR25a−/−, and IR25a−/− rescue flies exposed to a shallow temperature
cycle in LL. Although TIM expression was robustly rhythmic and
synchronized in all clock neuronal groups in controls, TIM was barely
detectable in the Dorsal Neuron 1 (DN1) and DN2 of IR25a−/− flies
(Fig. 3a and Extended Data Fig. 7a, b). Moreover, in the small and large
ventral lateral neurons (s-LNv and l-LNv), TIM expression exhibited
an additional peak during the warm phase (Fig. 3a and Extended Data
Fig. 7a, b). In the DN3, TIM declined earlier compared to controls and




there was no effect on the dorsal lateral neurons (LNd). In temperature 
cycles and DD, TIM levels in DN1 were also blunted but oscillations 
in the DN2 and DN3 were similar to controls. In contrast to LL, TIM 
did not oscillate in any of the LN groups and was at constantly low 
levels (Fig. 3b), consistent with the behavioural results obtained under 
these conditions (Fig. 2b, d). The alterations of TIM expression are 
temperature specific, as we observed normal oscillations in LD cycles 
at 25°C (Extended Data Fig. 7c). An increase of the temperature cycle 
amplitude to 4 °C also restored normal TIM expression in IR25a−/−
flies, in agreement with the behavioural rescue (Extended Data 
Figs 4a, b and 7d). In summary, in low-amplitude temperature cycles, 
IR25a is required for normally synchronized TIM oscillations in DN1-3 
and LNv in LL and in DN1 and LN clock neurons in DD. 

We tested if the clock neurons affected by the lack of IR25a are 
indeed involved in regulating behavioural synchronization to 
shallow temperature cycles by blocking synaptic transmission using 
tetanus-toxin (TNT). Indeed, TNT-expression in DN1 and DN2 
blocked synchronization in LL, whereas in DD only DN1 blockage 
interfered with temperature entrainment (Fig. 3c, d)?°. Consistent 
with the differential effect on TIM oscillations in LL and DD 
(Fig. 3a, b) these results strongly suggest that IR25a is required for the 
synchronized output of the DN1 (LL and DD) and DN2 (LL) to control 
temperature-entrained behaviour. 

Next, we asked if ChO might directly sense temperature in an 
IR25a-dependent manner. We recorded leg nerve activity in restrained 
preparations and identified ChO units in the compound signal 
(Fig. 4a). In both wild-type and IR25a−/− flies, spontaneous leg
movement changed as a function of temperature along with motor
and sensory activity. Additionally, presumed ChO activity of wild-type
flies also increased during periods without movement (Fig. 4b,
third insert). This temperature-induced but movement-independent
ChO activity was absent in IR25a−/− flies, showing that temperature



Figure 2 | IR25a is required for temperature synchronization to
low-amplitude temperature cycles. a, Upper part shows double-plotted
average actograms depicting the daily activity levels and environmental
conditions during the entire experiment. White areas, LL and 25 °C; orange
areas, LL and 27 °C. Histograms show daily average activity levels during
the initial LL treatment and the last 3 days of each temperature cycle.
Light orange, 25 °C; dark orange, 27 °C; white bars, activity levels in LL.
Error bars indicate s.e.m.; numbers (n) in the upper-right corner; x axis,
Zeitgeber time (h) and y axis, total activity (beam crossings per 30 min).
b, As in a, but flies were initially kept in LD 25 °C before being exposed to
a 7 h phase-advanced temperature cycle in DD (dark histogram bars) and
free-running conditions (DD and 25 °C). Actogram shading as in a, but
grey areas indicate darkness. Green and red arrows indicate the position
(phase) of the main activity peak during the final free run for control and
mutant flies, respectively. c, e, Entrainment index values (mean ± s.e.m.)
during 25 °C:27 °C temperature cycles in LL (delay as in a) (c), and as
indicated in e (all delay, except 25 °C:27 °C, advance) (see Extended Data
Fig. 3b for actograms and daily average plots). In c, per01 and nocte1
flies were used as negative controls. ***P < 0.001, **P < 0.01, NS, not
significant, one-way ANOVA followed by Bonferroni correction.
d, Phase difference during DD and constant temperature after temperature
cycles between IR25a−/− (n = 12/11/12 for 7 h advance/8 h delay/8 h
advance temperature cycles, respectively) and y w control (n = 16/10/14,
respectively) and IR25a−/− rescue flies (n = 16/18/12). ****P < 0.0001,
***P < 0.001, **P < 0.01; F-statistic (Watson-Williams-Stevens test).


is sensed in the legs in an IR25a-dependent manner (Fig. 4c). To test
if IR25a contributes directly to temperature-sensing, we ectopically
expressed this channel in the physiologically well-characterized,
IR25a-negative l-LNv (Extended Data Fig. 2f). As a positive control,
we also expressed the temperature-sensitive Drosophila TRPA1 channel
(ref. 21) in the l-LNv. Isolated brains were exposed to a temperature ramp,
and spike frequency of individual l-LNv was recorded. Control l-LNv
did not show a significant temperature-dependent change in neural
activity (Fig. 4d). As expected, the firing rate of TRPA1 expressing




Figure 3 | IR25a is required for clock protein oscillations in central
clock neurons. a, b, TIM levels in clock neurons during LL (a) and DD
(b) 25 °C:27 °C temperature cycles at the indicated Zeitgeber time (ZT)
points. At least 8 brain hemispheres per time point were analysed for
each genotype. Error bars indicate s.e.m. c, Progeny of UAS-IMP-TNT
and UAS-TNT females crossed to Clk4.1M-gal4 (DN1>, upper panel) or
Clk9M-gal4; Pdf-gal80 (DN2>, lower panel) males were exposed to two
6 h-delayed temperature cycles (12 h at 25 °C:12 h at 27 °C in LL). Left,
actograms, shading as in Fig. 2a. Right, entrainment index calculations
(mean ± s.e.m.); numbers in bars indicate n. **P < 0.01; one-way
ANOVA followed by Bonferroni correction. d, Same genotypes as in c
were exposed to an 8 h-delayed 12 h 25 °C:12 h 27 °C temperature
cycle in DD. Left, actograms plotted as in Fig. 2b. Right, phase difference
of activity peaks during final constant conditions between controls
(DN1/DN2 > UAS-IMP-TNT, n = 9/12, respectively) and the indicated
genotypes (DN1/DN2 > TNTE, n = 16/10). ****P < 0.0001, NS, not
significant, F-statistic (Watson-Williams-Stevens test).


neurons increased drastically and linearly with temperature, as did other
cellular parameters (Extended Data Fig. 8). IR25a expression resulted
in a linear and reversible temperature-dependent increase in action
potential firing frequency (Fig. 4e, i), whereas other cellular parameters
showed no difference (Fig. 4f-h). Increasing the temperature by
only 2-3 °C also led to a reversible increase in firing frequency of
1.03 ± 0.20 Hz in IR25a expressing l-LNv (Fig. 4j). By contrast, expression
of the related, but olfactory-specific, co-receptor IR8a (which is
not required for temperature entrainment, Fig. 2c) did not confer
temperature-sensitivity (Extended Data Fig. 8). These observations
suggest that IR25a is at least part of a thermosensory receptor required
for temperature entrainment.

Our data indicating that IR25a contributes to temperature sensing
within ChO extend the roles of IRs beyond chemoreception,
reminiscent of the requirement for the ‘gustatory receptor’ Gr28b in
warmth-avoidance (ref. 22). Although we show that IR25a-expressing leg neurons
are capable of sensing temperature and mediating temperature
entrainment, it is possible that this receptor has a similar role elsewhere
in the peripheral nervous system. IR25a responds to small temperature
changes and we propose that the fly continuously integrates


temperature signals received from multiple ChO across the whole body
for synchronization of the clock. This potential reliance on weakly
responding temperature receptors might explain why the Drosophila
circadian clock is insensitive to brief temperature pulses”, which could
help maintain synchronized clock function in natural conditions of
rapid and large temperature fluctuations.

Figure 4 | IR25a is required for temperature-induced leg nerve responses
and confers temperature sensitivity to l-LNv. a, Schematic of
the setup. b, Recording of a control fly leg nerve including motor and
sensory axons. The first extended insert shows a discharge of presumed
ChO sensory units in response to manual extension of the tibia (green
bars). Heating the preparation from 20 °C to 30 °C (middle, red trace)
led to spontaneous leg movement with concurrent motor and sensory
activity (second insert) but also to increased sensory firing in the
absence of leg or motor activity (third insert), which was reversible
with intact tibia extension response (fourth insert) (n = 9). c, IR25a−/−
shows similar responses to tibia extension and temperature-dependent
leg movement, but no sensory activity in response to elevated
temperature (n = 6). d, Whole-cell current clamp recordings of l-LNv
control and Pdf > IR25a brains exposed to the indicated temperature
ramp. e, Quantification of the temperature response from multiple
recordings (mean, s.e.m.). f, Vm, membrane potential. g, Rin, input
resistance. h, F, spontaneous firing rate at 18 °C. i, Temperature
coefficient Q10 > 4, **P < 0.01; t-test. Error bars indicate s.e.m., numbers (n)
written on the bars. j, IR25a expressing l-LNv also report small (2-3 °C)
temperature changes leading to a ~1 Hz alteration of the instantaneous
firing rate (n = 5).

Online Content Methods, along with any additional Extended Data display items and
Source Data, are available in the online version of the paper; references unique to
these sections appear only in the online paper.

Received 14 April; accepted 26 October 2015.
Published online 18 November 2015.

1. Dunlap, J. C., Loros, J. J. & DeCoursey, P. J. Chronobiology: Biological Timekeeping (Sinauer Associates, 2004).
2. Ouyang, Y., Andersson, C. R., Kondo, T., Golden, S. S. & Johnson, C. H. Resonating circadian clocks enhance fitness in cyanobacteria. Proc. Natl Acad. Sci. USA 95, 8660-8664 (1998).
3. Bechtold, D. A., Gibbs, J. E. & Loudon, A. S. Circadian dysfunction in disease. Trends Pharmacol. Sci. 31, 191-198 (2010).
4. Helfrich-Förster, C., Winter, C., Hofbauer, A., Hall, J. C. & Stanewsky, R. The circadian clock of fruit flies is blind after elimination of all known photoreceptors. Neuron 30, 249-261 (2001).
5. Hughes, S., Jagannath, A., Hankins, M. W., Foster, R. G. & Peirson, S. N. Photic regulation of clock systems. Methods Enzymol. 552, 125-143 (2015).
6. Brown, S. A., Zumbrunn, G., Fleury-Olela, F., Preitner, N. & Schibler, U. Rhythms of mammalian body temperature can sustain peripheral circadian clocks. Curr. Biol. 12, 1574-1583 (2002).
7. Wheeler, D. A., Hamblen-Coyle, M. J., Dushay, M. S. & Hall, J. C. Behavior in light-dark cycles of Drosophila mutants that are arrhythmic, blind, or both. J. Biol. Rhythms 8, 67-94 (1993).
8. Maguire, S. E. & Sehgal, A. Heating and cooling the Drosophila melanogaster clock. Curr. Opin. Insect Sci. 7, 71-75 (2015).
9. Sehadova, H. et al. Temperature entrainment of Drosophila’s circadian clock involves the gene nocte and signaling from peripheral sensory tissues to the brain. Neuron 64, 251-266 (2009).
10. Florence, T. J. & Reiser, M. B. Neuroscience: hot on the trail of temperature processing. Nature 519, 296-297 (2015).
11. Gallio, M., Ofstad, T. A., Macpherson, L. J., Wang, J. W. & Zuker, C. S. The coding of temperature in the Drosophila brain. Cell 144, 614-624 (2011).
12. Glaser, F. T. & Stanewsky, R. Temperature synchronization of the Drosophila circadian clock. Curr. Biol. 15, 1352-1363 (2005).
13. Wolfgang, W., Simoni, A., Gentile, C. & Stanewsky, R. The Pyrexia transient receptor potential channel mediates circadian clock synchronization to low temperature cycles in Drosophila melanogaster. Proc. R. Soc. Lond. B 280, 20130959 (2013).
14. Rees, J. S. et al. In vivo analysis of proteomes and interactomes using parallel affinity capture (iPAC) coupled to mass spectrometry. Mol. Cell Proteomics 10, M110.002386 (2011).
15. Abuin, L. et al. Functional architecture of olfactory ionotropic glutamate receptors. Neuron 69, 44-60 (2011).
16. Benton, R., Vannice, K. S., Gomez-Diaz, C. & Vosshall, L. B. Variant ionotropic glutamate receptors as chemosensory receptors in Drosophila. Cell 136, 149-162 (2009).
17. Rytz, R., Croset, V. & Benton, R. Ionotropic receptors (IRs): chemosensory ionotropic glutamate receptors in Drosophila and beyond. Insect Biochem. Mol. Biol. 43, 888-897 (2013).
18. Petersen, L. K. & Stowers, R. S. A Gateway MultiSite recombination cloning toolkit. PLoS ONE 6, e24531 (2011).
19. Sayeed, O. & Benzer, S. Behavioral genetics of thermosensation and hygrosensation in Drosophila. Proc. Natl Acad. Sci. USA 93, 6079-6084 (1996).
20. Kaneko, H. et al. Circadian rhythm of temperature preference and its neural control in Drosophila. Curr. Biol. 22, 1851-1857 (2012).
21. Hamada, F. N. et al. An internal thermal sensor controlling temperature preference in Drosophila. Nature 454, 217-220 (2008).
22. Ni, L. et al. A gustatory receptor paralogue controls rapid warmth avoidance in Drosophila. Nature 500, 580-584 (2013).
23. Busza, A., Murad, A. & Emery, P. Interactions between circadian neurons control temperature synchronization of Drosophila behavior. J. Neurosci. 27, 10722-10733 (2007).




Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank P. Emery, J. Albert, J. Jepson, P. Garrity, and
A. Samuel for discussions and sharing of unpublished results, J. Giebultowicz
for anti-TIM antibodies, J. Albert and J. Jepson for fly stocks, C. Tardieu and
R. Kavlie for help with qPCR, D. Carr for assistance with the temperature
recording setup, and M. Ogueta-Gutierrez for help with figure preparations.
The drawing for Fig. 4a was generated by Polygonal Tree
(http://polygonaltree.co.uk/). This work was supported by BBSRC grants
BB/H001204 to R.S., BB/JO-18589/-17221 to R.S. and J.J.L.H., and a CSC PhD
fellowship to C.C. V.C. was supported by a Boehringer Ingelheim Foundation
Fellowship. Research in R.B.’s laboratory was supported by European
Research Council Starting Independent Researcher and Consolidator Grants
(205202 and 615094). Mass spectrometry analysis was supported by
Wellcome Trust grant 099135/Z/12/Z.


Author Contributions C.C., E.B., R.S., J.J.L.H., R.B. and K.S.L. conceived, 
designed, and supervised the project. C.C., E.B., M.X., V.C., and J.S.R. performed 
experiments. C.C., E.B., and J.S.R. analysed data, and R.S. wrote the paper, with 
feedback from all authors. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
R.S. (r.stanewsky@ucl.ac.uk). 




METHODS 


Plasmids and germline transformations. To generate the psp-flag-strep II-nocte-ha 
(FSNH) construct, a flag-strepII-venus-strepII (fsvs) fragment was amplified from a 
PiggyBac/P-element YFP-flag-strep II construct (ref. 14) using a Phusion High-Fidelity
PCR kit (New England Biolabs). This 900 bp fragment was sub-cloned into psp73 
(Promega) to generate psp73-fsvs with BglII/Xholl sites. To introduce a strepII 
tag upstream of the NOCTE N terminus, a 0.5 kb fragment was amplified from 
psp73-nocte-HA (containing the entire nocte coding region fused to ha; Giesecke 
and Stanewsky, unpublished) by annealing the strep II tag directly using PCR. This 
fragment was religated back into psp73-nocte-ha to generate psp-strep I-nocte-ha. 
A 3x Flag tag was introduced 5′ of psp73-strep I-nocte-ha by sub-cloning the
strep I-nocte-ha fragment into psp-fsvs with BstBI/XholI sites replacing venus. 
To generate the psp-flag-strep I-nocte-strep II construct (FSNS), a Strep II tag was 
amplified and annealed 3’ of psp-nocte (Giesecke and Stanewsky, unpublished) to 
generate psp-nocte-strep II, followed by sub-cloning into psp-fsnh using BstBI/Xholl 
sites. FSNH and FSNS were sub-cloned into the transformation vector pUAST 
using BglII/Xholl sites, and transgenic flies were generated using classical transpo- 
sase-mediated germline transformation. To generate mCherry-IR25a, the coding 
sequence of IR25a lacking the endogenous signal sequence (starting from codon 
31) was PCR amplified, subcloned into pUAST-mCherry attB’, and integrated 
into attP2. To generate the IR25a genomic rescue construct, the bacterial artificial 
chromosome (BAC) CH322-32C20~ was integrated into attP16, and then recombined
onto the IR25a” mutant chromosome. Restoration of IR25a expression by
this BAC (in IR25a’, CH322-32C20/IR25a*, CH322-32C20 animals) was verified 
by immunostaining with anti-IR25a antibodies (data not shown). All constructs 
generated in this study were confirmed by DNA sequencing. 

Fly strains. Flies were kept at 25 °C or 18 °C on common cornmeal-yeast-sucrose
food under light:dark cycles and 60-70% humidity. As controls, wild-type Canton
S and y w flies (both carrying the ls-tim allele) were used. The following flies used
in this study were previously described or obtained from the Bloomington Stock
Center: tim-gal4:67 (ref. 25), Clock856-gal4 (ref. 26), F-gal4:33-5 (ref. 27), nocte-gal4
(ref. 9), IR25a-gal4 (ref. 15), gmr-gal4 (BL1104), Pdf-gal4 (ref. 28), elav-gal4; UAS-
dicer (BL25750), UAS-dicer (BL24646), trpA1-gal4 (BL27593), nompC-gal4 (ref. 29),
nompC-QF (BL36346), ppk-gal4 (BL32078), UAS-GFP (ref. 25), nompA-GFP™,
UAS-mCherry (BL52268), QUAS-mtdTomato (BL30037), UAS-mCD8-GFP; UAS-
DsRed?, Pdf-RFP*', UAS-TrpA1 (ref. 21), UAS-IR25a°, y per” w and y per’ w°?,
IR25a−/−: either homozygous IR25a? or IR25a7/IR25a' flies, both null mutant
alleles generated by gene targeting (ref. 16), IR25a, CH322-32C20/IR25a’, CH322-32C20
(ref. 15) (outcrossed to Canton S for six generations and here referred to as IR25a
rescue), IR8a1: null allele of IR8a and referred to as IR8a−/− !°, nocte1: encodes
truncated version of the NOCTE protein (ref. 9), UAS-TNT-E and UAS-IMP-TNT-V1-B
(inactive). Clk4.1M-gal4 and Clk9M-gal4;Pdf-gal80 flies were used to direct GAL4
expression to subsets of the DN1p and to the DN2, respectively”°*4. IR25a-RNAi
lines 15627-R1 and 15627-R2 were obtained from the NIG-Fly Stock Center. w1118;
Df(2L)Exel6010/CyO was used as IR25a deficiency (BL7496).

Immunostaining and quantification. GFP and/or RFP signals were analysed as 
described in’. Briefly, antennae and legs were fixed and dissected in 4% para- 
formaldehyde/PBS solution. Samples were then washed 3 times in 3% PBST at 
room temperature followed by mounting in Vectashield (Vector Labs) medium and 
inspected using a Leica TCS SP5 confocal microscope. To visualize endogenous 
IR25a expression in the ChO of fly antennae and legs, cryosections (16 μm) and
immunolabelling were performed as described in*> with minor modifications:
sections were collected on slides and fixed for 10 min in 4% formaldehyde in PBS.
After washing for 2 × 10 min in PBS, sections were treated for 30 min in PBS + 0.1%
Triton X-100 (PBT) and incubated in 5% normal goat serum (NGS) for 30 min.
Primary antibodies (rabbit anti-IR25a 1:500 (ref. 16), mouse anti-22C10 1:200,
DSHB) were diluted in PBT with NGS and applied to slides placed horizontally
in humidified chambers and left for 2 h at room temperature followed by incubation
overnight at 4 °C. After washing for 3 × 10 min in PBT, slides were blocked
in PBT with NGS for 30 min and incubated with secondary antibodies (rabbit
AlexaFluor-594, 1:500, Mouse AlexaFluor-647 1:500, Invitrogen) diluted in PBT
in the dark for 4h at room temperature. Slides were washed 3 x 5 min in PBS and 
mounted in Vectashield before observation. Immunostaining of whole-mounted 
brains was performed as described in*° with minor modifications. For LD experi- 
ments, flies were fixed on the fifth day of light entrainment. For temperature
experiments, flies were first reared in LL and 25 °C for 3 days, and then transferred to
25 °C:27 °C or 25 °C:29 °C temperature cycles for 7 days. Temperature cycles were
rectangular and not ramped. Therefore, the conditions are consistent with those
for behavioural analysis. For temperature entrainment in DD, flies were initially
entrained to LD at 25 °C for 2 days followed by a 25 °C:27 °C temperature cycle in
DD that was shifted 8 h in advance with respect to the previous LD cycle. Brains
were dissected on day 6 of the temperature cycles at the indicated time points. 
Primary rat anti-TIM (1:1,000)°”, and secondary rat AlexaFluor-594 antibodies
(Invitrogen, 1:500) were applied. Mounted brains were scanned using a Leica
TCS SP5 confocal microscope. Quantification of TIM signals was performed as
in** with minor modifications: pixel intensity of stained neurons and background
staining in each neuronal group was measured using ImageJ. Background signal
was determined by taking the average signal of two surrounding fields of each
neuronal group and was subtracted from the neuronal signal. For each group of clock
neurons, at least 8 hemispheres from each genotype were checked and measured
per time point. Data were normalized to the peak value: the signal at each time
point was divided by the peak value, so that the peak equals 1.
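
The background subtraction and peak normalization described above can be sketched as follows. This is only an illustration of the arithmetic stated in the text, not the authors' ImageJ workflow; the function names and intensity values are hypothetical.

```python
def background_corrected(neuron_intensity, background_fields):
    """Subtract the mean of the surrounding background fields from the
    pixel intensity measured for a neuronal group."""
    background = sum(background_fields) / len(background_fields)
    return neuron_intensity - background


def normalize_to_peak(values):
    """Divide every time point by the peak value so that the peak equals 1."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("peak signal must be positive after background subtraction")
    return [v / peak for v in values]


# Hypothetical raw intensities for one neuronal group at three time points,
# each paired with two surrounding background fields.
raw = [(120.0, [20.0, 30.0]), (60.0, [22.0, 28.0]), (45.0, [25.0, 25.0])]
corrected = [background_corrected(signal, fields) for signal, fields in raw]
relative_tim = normalize_to_peak(corrected)
```

In this sketch the normalization is applied to one series at a time; whether the peak is taken per genotype or across genotypes is a choice the text does not spell out.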
Co-immunoprecipitation. Co-immunoprecipitation experiments were performed
as described". For each protein purification, 200-300 mg wet-weight of heads
from gmr-gal4 flies expressing UAS-nocte-flag (FSNS and FSNH transgenics were
used in 2 independent experiments) or gmr-gal4 alone (negative controls) were
collected on dry ice and manually homogenized with a 2 ml Dounce homogenizer
(Fisher) in 1 ml of extraction buffer (final protein concentration 5 mg ml−1 extraction
buffer) containing 50 mM Tris, pH 7.5, 125 mM NaCl, 1.5 mM MgCl2, 1 mM
EDTA, 5% glycerol, 0.4% NP-40, and 0.1% Tween 20. To prevent degradation
during the lengthy purification steps, 2x protease mini EDTA inhibitor mixture
(Roche) was added at hourly intervals throughout the procedure. The homogenate
was centrifuged at 10,000 r.p.m. for 15 min to isolate the soluble fraction used for
pull-down. For the Flag pull-down procedure, EZview Red anti-Flag M2 affinity
gel (Sigma) was used to bind the Flag-tagged bait and its bound partners. 50 μl
pre-washed 50% slurry was added to 1 ml soluble protein and incubated at 4 °C for 2 h
on a rotary mixer. Non-binding material was removed by centrifugation (8,000g
for 2 min) and the resin was washed three times in ice-cold extraction buffer. For
checking the interaction with IR25a (Extended Data Fig. 1a), the washed resin was
directly boiled with 5x SDS loading buffer followed by routine western blot. To
elute the isolated protein complexes (Flag-tagged protein with any associated
proteins), the resin was incubated and eluted three times, each with 50 μl (100 μg ml−1) Flag
peptide (Sigma) in extraction buffer for 30 min at 4 °C on a rotary mixer. The
three eluates were combined and any residual resin was removed by centrifugation
at 8,000g for 2 min. The subsequent mass spectrometry peptide sequencing was
performed by Cambridge Centre for Proteomics. Briefly, eluates from the tagged
line and untagged control flies were processed as described", The only deviation
from the method described was that peptides were applied to a 180 μm × 20 mm
(5 μm particle size) C18 trap column (Waters UPLC Trap Symmetry) coupled to
a nanoAcquity UPLC system (Waters) using 0.1% formic acid in water (buffer A)
at a flow rate of 10 μl min−1. Peptides were then separated on a 75 μm × 250 mm
(1.7 μm particle size) reverse phase BEH C18 analytical nano-column (Waters) at a
flow rate of 300 nl min−1 using a gradient of buffer A and buffer B (0.1% formic acid
in acetonitrile). The HPLC system was directly coupled to a LTQ Orbitrap Velos
(Thermo Scientific) with a New Objective nanospray ionisation source operated
at a resolution of 60,000. Peptides were eluted with a linear gradient of 5-45%
buffer B over 45 min or with a re-equilibration step, giving total running times of
60 min. The Orbitrap analyser survey scan was performed over a mass range of
m/z 380-1,500, each of them triggering 10 MS2 LTQ acquisitions of the ten most
intense ions exceeding 500 counts using a data dependent acquisition mode.
Western blot. To confirm the interaction between NOCTE and IR25a, total
head proteins were isolated from flies expressing IR25a, or IR25a and Flag-tagged
NOCTE, under the control of tim-gal4. Boiled beads (after Co-IP) were loaded
on SDS-PAGE gels, followed by standard western blot. Primary rabbit anti-IR25a
(1:5,000; ref. 16) and mouse anti-Flag M2 (1:1,000; Sigma) antibodies, and secondary
HRP-conjugated goat anti-rabbit IgG (1:10,000) and goat anti-mouse IgG
(1:1,000) antibodies (Jackson), were used.

RNA isolation and RT-PCR. For RNA extractions, 30-50 flies were collected in 
2 ml RNAlater (Ambion) and kept at 4 °C overnight, and 100 µl 0.1% PBST was
added to help RNAlater penetration. Femurs from around 200 fly legs and 50 
retinas were quickly dissected in cold RNAlater. Total RNA was extracted using an 
RNeasy kit (QIAGEN) according to the manufacturer’s instructions. Total RNA
was finally eluted in RNase-free water and stored at −80 °C. cDNA synthesis was
performed with Reverse Transcription Reagents Kit (Applied Biosystems) in 10 µl
reactions using 1 µg of total RNA according to the manufacturer's instructions.
To verify mRNA expression level of IR25a and nocte in fly femur ChO, dilutions 
of cDNA were used for PCR with the following primers: rp49 and nocte (ref. 9), 
IR25a (ref. 39), followed by DNA electrophoresis on 2% agarose gels to visualize 
the PCR products. To test the efficiency of IR25a RNAi, 20 heads of 5-10-day-old 
flies were dissected and RNA was extracted and reverse transcribed as described 
above. Taqman probes for IR25a (catalogue number 4351372, ThermoFisher) and 
RPL32 (catalogue number 4448489, ThermoFisher) were applied to determine 
the amount of mRNA. For determining IR25a mRNA levels in different tissues, 
body parts were dissected in RNAlater (Ambion). 1 µg RNA was used for cDNA
synthesis. Real-time assays were performed using an ABI GeneAMP PCR system
9700 using the standard program, and Ct (threshold cycle) values were applied to
determine the amount of RNA in each genotype. The relative concentrations were
calculated using the 2^(−ΔΔCt) method, and RPL32 was used as control.
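
The 2^(−ΔΔCt) calculation referred to above can be sketched as follows. This is a generic illustration of the standard formula, not the authors' analysis script; the function and argument names are hypothetical, with IR25a as the target and RPL32 as the reference gene.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative target mRNA level in a sample versus a control genotype.

    Each argument is a threshold-cycle (Ct) value. The reference gene
    (here RPL32) normalizes for the amount of input cDNA.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # ΔCt, sample
    delta_ct_control = ct_target_control - ct_ref_control   # ΔCt, control
    delta_delta_ct = delta_ct_sample - delta_ct_control     # ΔΔCt
    return 2.0 ** (-delta_delta_ct)
```

For example, a knockdown sample whose target crosses threshold two cycles later than in the control, with identical reference Ct values, gives a relative level of 0.25, that is, a 75% reduction. The formula assumes roughly 100% amplification efficiency for both probes.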
Behavioural analysis. Analysis of locomotor activity of 4-5-day-old male flies was 
performed using the Drosophila Activity Monitor System (DAM, Trikinetics). The 
DAM monitors, as well as an environmental monitor (Trikinetics), were located 
inside a light- and temperature-controlled incubator where the fly’s activity was 
monitored for a few weeks depending on different experimental conditions. 
Plotting of behavioural activity and period calculations were performed using a 
signal-processing toolbox (ref. 40) implemented in Matlab (MathWorks). In order to
quantify behaviour during temperature cycles, an updated Histogram version 
based on Excel (Office, Microsoft) was applied**. Briefly, the activity from the 
last two days of each temperature cycle was plotted in Excel in 30-min bins and 
an ‘entrainment index’ (EI = ratio of activity occurring during the 6 h window 
covering the main activity peak of the positive controls over the activity during the 
entire warm phase) was calculated. To distinguish the clock-controlled behavioural 
peaks from temperature response peaks, a simple smoothing filter was applied for 
the four activity bins during the 2h following each temperature transition**. The 
filtered data was used for calculation, whereas the raw activity data are plotted 
in the histograms. The entrainment index values plotted in all histograms repre- 
sent the average entrainment index from the temperature cycles before and after 
the shift except for 18°C:20°C and 21°C:23 °C, where the entrainment index was 
generated from the temperature cycles after the shift. To calculate the phase of the 
main activity peaks after DD and temperature cycles involving genotypes that did 
not show clear activity peaks during entrainment, we employed circular phase 
plot analysis as previously described**". In brief, the mean activity phase of the 
three consecutive days after release into constant conditions was determined for 
each fly of the two genotypes to be compared. An average ‘vector’ indicating phase 
coherence (length) and mean peak phase (direction) is calculated for each genotype 
and the two vectors are compared by an F-statistic (Watson–Williams–Stevens
test). The difference in direction is plotted in hours (h) and the mean peak phase 
of the controls (negative or positive controls, depending on the experiment) was 
set to zero. 
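
The two behavioural measures described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' Excel or Matlab code: activity is assumed to be a list of counts in 30-min bins, the bin indices are hypothetical parameters, the smoothing filter applied after temperature transitions is omitted, and the Watson–Williams–Stevens comparison of the two vectors is not implemented.

```python
import math

def entrainment_index(bins, window_start, warm_start, warm_end, window_bins=12):
    """Ratio of activity in a 6 h window (12 x 30-min bins) to activity
    during the entire warm phase. Indices refer to positions in `bins`."""
    window_activity = sum(bins[window_start:window_start + window_bins])
    warm_activity = sum(bins[warm_start:warm_end])
    return window_activity / warm_activity if warm_activity > 0 else float("nan")

def mean_phase_vector(phases_h, period_h=24.0):
    """Mean vector of peak phases (in hours) on a circle of length period_h.

    Returns (length, direction_h): length near 1 means the flies are phase
    coherent, length near 0 means phases are scattered.
    """
    angles = [2.0 * math.pi * p / period_h for p in phases_h]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    length = math.hypot(c, s)
    direction_h = (math.atan2(s, c) * period_h / (2.0 * math.pi)) % period_h
    return length, direction_h
```

With uniform activity across a 12 h warm phase, the entrainment index is 0.5, which is the value expected when activity is not concentrated in the 6 h window.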

Electrophysiology. Extracellular leg nerve recording. The question of whether ChO 
in the legs would respond to temperature changes was examined using extracellular 
recordings in restrained intact leg preparations. Canton S w* flies were used 
as a control and IR25a−/− mutants were used to test if IR25a is involved in this
process. In order to minimize locomotory artefacts, flies were decapitated and 
all legs but the left hind leg amputated. Flies were mounted ventral side up and 
pinned down in a Sylgard (Dow Corning, USA) coated recording chamber so that 
the left hind leg was orientated perpendicular to the body but not immobilized 
(Fig. 4a). Therefore the ChO could be stimulated in vivo by moving the tibia with 
a fine needle. A tungsten wire electrode, sharpened to a fine point and inserted
through the cuticle in the thorax, served as a reference electrode and a similar
recording electrode was placed in the coxa of the remaining leg. The final position
of the recording electrode was determined by monitoring the signal it was recording
and then manually extending the tibia until a response in sensory units was
seen. The signal was amplified using a BioAmp extracellular amplifier, filtered (low 
5 kHz, high 10 Hz), digitized (sampling frequency 10 kHz) with a PowerLab 2/20 
and recorded using LabChart 7 (ADInstruments, Bella Vista, Australia). 
Whole-cell recordings. Different genotypes were used for each group and the data 
pooled as there were no differences between them: control, Pdf-RFP and Pdf-gal4; 
UAS-mCherry; IR25a, Pdf-gal4/UAS-IR25a; Pdf-RFP and Pdf-gal4/UAS-mCherry- 
IR25a; TrpA1, Pdf-gal4/UAS-TrpA1; Pdf-RFP and Gal1118-gal4/UAS-TrpA1/UAS-
mCherry-IR8a; Pdf-gal4/UAS-IR8a; Pdf-RFP and Pdf-gal4/UAS-mCherry-IR8a. 
Experiments were performed under red light illumination and light exposure 
during dissection was kept to a minimum. For visualization of the l-LNv we used 
RFP-tagged constructs and a 555 nm LED light source in order to not activate 
cryptochrome. Adult flies raised in 12 h:12 h LD at 25 °C were collected ~3-5
days post eclosion between ZT13 and ZT16 and decapitated, and brains were dissected in
extracellular saline solution containing (in mM): 101 NaCl, 1 CaCl2, 4 MgCl2,
3 KCl, 5 glucose, 1.25 NaH2PO4, 20.7 NaHCO3, pH adjusted to 7.2. The brains
were transferred for 5-10 min to saline containing 20 U per ml papain with 1 mM
L-cysteine to digest the ganglion sheath. After removal of the photoreceptors, air
sacs and trachea, a small incision was made over the position of the l-LNv neurons
in order to give easier access for the recording electrodes. Brains were placed
ventral side up in the recording chamber, secured using a custom-made anchor and
continuously perfused during recordings with aerated (95% O2, 5% CO2) saline
solution. l-LNv neurons were identified on the basis of their fluorescence, size and
position. A single recording was performed from one l-LNv per brain. Whole-cell
current clamp recordings were performed using glass electrodes with 10-20 MΩ
resistance filled with intracellular solution (in mM: 102 K-gluconate, 17 NaCl, 0.94
EGTA, 8.5 HEPES, 0.085 CaCl2, 1.7 MgCl2, pH 7.2) and an Axon MultiClamp 700B
amplifier, digitized with an Axon DigiData 1440A (sampling rate: 20 kHz; filter: 
Bessel 10 kHz) and recorded using pClamp 10 (Molecular Devices, USA). A cell 
was included in the analysis if the access resistance was less than 70 MΩ and the
leak current in response to a —40 mV pulse less than —100 pA. All chemicals were 
purchased from Sigma (Poole, UK). The liquid junction potential was calculated 
as 13 mV and was subtracted from all the membrane voltages. Resting membrane 
potential (Vm) was measured after stabilizing for 2-3 min. Membrane input resistance
(Rin) was calculated by injecting hyperpolarizing current steps and measuring
the resulting changes in voltage. Spike frequency was manually measured using 10s 
bins for each degree of temperature. To test the effect of elevated temperature, the 
recording chamber and the perfusion influx were gradually heated from 18°C to 
30°C within 5-10 min and cooled back to 18°C within 10-15 min using a Peltier 
heating system (ALA Scientific Instruments, USA) and TC-10 controller (npi, 
Tamm, Germany). The temperature coefficient Q10 was calculated by dividing the 
firing rate at 30 °C by the rate at 20 °C. To check whether l-LNvs can also sense small
temperature changes of 2-3 °C, neurons were recorded as before, the temperature 
increased to around 24.5 °C held for 3 min, then increased to 27.5°C, held again for 
3 min, cooled back down to 24.5 °C and recorded for a further 3 min. During the 
whole period the instantaneous spiking frequency was monitored. All values are 
given as mean and s.e.m. and a t-test and ANOVA (followed by Tukey test) were 
used to calculate significant differences. 
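
The derived electrophysiological quantities described above can be sketched as follows. This is a hedged illustration, not the authors' analysis code: the function names are hypothetical, and estimating Rin as the least-squares slope of voltage against injected current is an assumption, since the text states only that Rin was calculated from the voltage changes produced by hyperpolarizing current steps.

```python
def q10(rate_high, rate_low, temp_high=30.0, temp_low=20.0):
    """Temperature coefficient. For a 10 °C interval (30 °C vs 20 °C, as in
    the text) this reduces to the simple ratio rate_high / rate_low."""
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))

def correct_junction_potential(v_measured_mv, ljp_mv=13.0):
    """Subtract the calculated liquid junction potential (13 mV in the text)."""
    return v_measured_mv - ljp_mv

def input_resistance_mohm(currents_pa, voltages_mv):
    """Input resistance in MΩ from current steps (pA) and voltages (mV),
    taken as the least-squares slope of V against I (an assumption)."""
    n = len(currents_pa)
    mean_i = sum(currents_pa) / n
    mean_v = sum(voltages_mv) / n
    sxx = sum((i - mean_i) ** 2 for i in currents_pa)
    sxy = sum((i - mean_i) * (v - mean_v)
              for i, v in zip(currents_pa, voltages_mv))
    slope_gohm = sxy / sxx          # mV per pA equals GΩ
    return slope_gohm * 1000.0      # convert GΩ to MΩ
```

For instance, a neuron firing at 8 Hz at 30 °C and 4 Hz at 20 °C has a Q10 of 2.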

Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 


24. Venken, K. J. et al. Versatile P[acman] BAC libraries for transgenesis studies in
Drosophila melanogaster. Nature Methods 6, 431-434 (2009).

25. Kaneko, M. & Hall, J. C. Neuroanatomy of cells expressing clock genes in 
Drosophila: transgenic manipulation of the period and timeless genes to mark 
the perikarya of circadian pacemaker neurons and their projections. J. Comp. 
Neurol. 422, 66-94 (2000). 

26. Gummadova, J. O., Coutts, G. A. & Glossop, N. R. Analysis of the Drosophila 
Clock promoter reveals heterogeneity in expression between subgroups of 
central oscillator cells and identifies a novel enhancer region. J. Biol. Rhythms 
24, 353-367 (2009). 

27. Kim, J. et al. A TRPV family ion channel required for hearing in Drosophila. 
Nature 424, 81-84 (2003). 

28. Park, J. H. & Hall, J. C. Isolation and chronobiological analysis of a
neuropeptide pigment-dispersing factor gene in Drosophila melanogaster.
J. Biol. Rhythms 13, 219-228 (1998).

29. Liu, L. et al. Drosophila hygrosensation requires the TRP channels water witch 
and nanchung. Nature 450, 294-298 (2007). 

30. Chung, Y. D., Zhu, J., Han, Y. & Kernan, M. J. nompA encodes a PNS-specific, 
ZP domain protein required to connect mechanosensory dendrites to sensory 
structures. Neuron 29, 415-428 (2001). 

31. Ruben, M., Drapeau, M. D., Mizrak, D. & Blau, J. A mechanism for circadian 
control of pacemaker neuron excitability. J. Biol. Rhythms 27, 353-364 (2012). 

32. Konopka, R. J. & Benzer, S. Clock mutants of Drosophila melanogaster.
Proc. Natl Acad. Sci. USA 68, 2112-2116 (1971).

33. Sweeney, S. T., Broadie, K., Keane, J., Niemann, H. & O’Kane, C. J. Targeted 
expression of tetanus toxin light chain in Drosophila specifically eliminates 
synaptic transmission and causes behavioral defects. Neuron 14, 341-351 
(1995). 

34. Zhang, Y., Liu, Y., Bilodeau-Wentworth, D., Hardin, P. E. & Emery, P. Light and 
temperature control the contribution of specific DN1 neurons to Drosophila 
circadian behavior. Curr. Biol. 20, 600-605 (2010). 

35. Saina, M. & Benton, R. Visualizing olfactory receptor expression and 
localization in Drosophila. Methods Mol. Biol. 1003, 211-228 (2013). 

36. Yoshii, T., Todo, T., Wulbeck, C., Stanewsky, R. & Helfrich-Forster, C. 
Cryptochrome is present in the compound eyes and a subset of Drosophila’s 
clock neurons. J. Comp. Neurol. 508, 952-966 (2008). 

37. Rush, B. L., Murad, A., Emery, P. & Giebultowicz, J. M. Ectopic CRYPTOCHROME
renders TIM light sensitive in the Drosophila ovary. J. Biol. Rhythms 21, 
272-278 (2006). 

38. Gentile, C., Sehadova, H., Simoni, A., Chen, C. & Stanewsky, R. Cryptochrome
antagonizes synchronization of Drosophila’s circadian clock to temperature 
cycles. Curr. Biol. 23, 185-195 (2013). 

39. Croset, V. et al. Ancient protostome origin of chemosensory ionotropic 
glutamate receptors and the evolution of insect taste and olfaction. PLoS 
Genet. 6, e1001064 (2010).

40. Levine, J. D., Funes, P., Dowse, H. B. & Hall, J. C. Signal analysis of behavioral 
and molecular cycles. BMC Neurosci. 3, 1 (2002). 

41. Simoni, A. et al. A mechanosensory pathway to the Drosophila circadian clock. 
Science 343, 525-528 (2014). 

42. Wilson, R. I. & Corey, D. P. The force be with you: a mechanoreceptor channel
in proprioception and touch. Neuron 67, 349-351 (2010). 

43. Stanewsky, R. et al. Temporal and spatial expression patterns of transgenes
containing increasing amounts of the Drosophila clock gene period and a lacZ 
reporter: mapping elements of the PER protein involved in circadian cycling. 
J. Neurosci. 17, 676-696 (1997). 


Extended Data Figure 1 | IR25a and Nocte physically interact in vivo
and are expressed in femur and antennal ChO neurons. a, In vivo
co-immunoprecipitation experiments using protein extracts from fly heads.
Head lysates were immunoprecipitated using anti-Flag antibody. The
immunoprecipitates were examined by western blotting using anti-Flag
and anti-IR25a antibody. Input represents 30% of cell lysates used in the
pull-down experiment. The genotypes of the flies used were: tim > IR25a:
UAS-GFP/UAS-IR25a; tim-gal4:67/+. tim > IR25a + FLAG-NOCTE:
UAS-FSNH/UAS-IR25a; tim-gal4:67/+. The bracket indicates that
NOCTE-Flag runs as a double band on western blots. For uncropped gel
images, see Supplementary Fig. 1. b, Overview of the antennal and
femur ChO adapted from refs 13, 42. c, d, Labelling of the JO neurons
by IR25a-gal4 (c) and F-gal4 (d) driven membrane bound mCD8-GFP
and nuclear-localized DsRed expression. Note that IR25a is expressed in
only a subset of JO neurons. e, f, Same flies as in c, d analysed for IR25a
expression in the femoral ChO. Again, only subsets of the ChO neurons
express IR25a. Arrows in c, e point to ChO neuron nuclei. g, Labelling of 
the ChO neurons in JO and femur by nocte-gal4-driven membrane-bound 
GFP and nuclear DsRed expression’. Scale bar, 20 µm.


[Extended Data Figure 2 image. Panel labels: GFP; IR25a; IR25a > mCD8-GFP; arista; sacculus; sensilla (basiconic, trichoid, coeloconic); Canton S; IR25a−/−; Relative IR25a RNA; UAS-dicer2;elav-Gal4; UAS-dicer;F-Gal4 > IR25aR1R2; UAS-dicer;GMR-Gal4 > IR25aR1R2; Leg; Retina; Brain; 22C10; Bright field; IR25a > Ds-Red-NLS.]

Extended Data Figure 2 | Spatial and quantitative IR25a mRNA and 
protein expression in CNS and PNS tissues and efficiency of RNAi- 
mediated knockdown. a, Analysis of IR25a-gal4 and IR25a in the third 
segment of the antenna reveals expression in coeloconic sensilla’®. 
Schematic adapted from”. b, c, Determination of IR25a and nocte mRNA 
levels in femur and retinal tissues by semiquantitative RT-PCR; rp49
was used as control. For uncropped gel data, see Supplementary Fig. 1.
d, e, qPCR analysis of IR25a mRNA levels in whole heads (d), or dissected
body parts (as indicated) (e) from flies of the genotypes indicated.
Pan-neuronal elav-gal4 knockdown (d) decreased IR25a mRNA >75%
or >90%, using one or two different RNAi lines combined, respectively.
****P < 0.0001, ***P < 0.001, *P < 0.05, one-way ANOVA followed by
Bonferroni correction. f, IR25a is not expressed in the central brain and
clock neurons. Left, IR25a immunolabelling of a Canton S brain reveals no
signals. Middle, same brain labelled with anti-TIM reveals expression
in clock neurons. Right, merge. Brains were dissected in LD at ZT20.
Scale bar, 10 µm. g, IR25a-gal4 is not expressed in clock neurons and largely
absent from the brain. Left, nuclear DsRed driven by IR25a-gal4. Second 
from left, anti-PDF staining showing LNv and their projections. Middle, 
anti-PER (diluted 1:5,000)** showing all clock neurons. Second from right, 
merge, showing two IR25a-gal4-positive cells in the antennal lobe, not
co-localized with any of the clock neurons. These cells were observed
in 4/8 hemispheres and always on the same side of the brain. Right,
magnified view of circled area in the merged image. Scale bar, 30 µm.


Extended Data Figure 3 | IR25a is required for temperature
synchronization to low-amplitude temperature cycles but not for
high-amplitude temperature cycles. a, Canton S, IR25a−/−, and nocte1
flies were exposed to LD at 20 °C for 5 days (left) or LL at 25 °C for 2 days
(right), followed by exposure to a 12 h:12 h 20 °C:29 °C (left) or 16 °C:25 °C
(right) temperature cycles in LL, which after 6-7 days was delayed or
advanced by 6 h, respectively. Warmer temperature indicated by red and
orange shading, respectively. b, Actograms and daily averages of Canton S
and IR25a−/−, and IR25a−/− flies containing a genomic IR25a rescue
construct (rescue) exposed to 18 °C:20 °C temperature cycles in LL (left)
and 21 °C:23 °C temperature cycles in LL (right). Warm phase in actograms
indicated by orange shading. Histogram colour coding as in Fig. 2.
For quantification see Fig. 2e.


Extended Data Figure 4 | IR25a is not required for temperature
synchronization to high- but to low-amplitude temperature cycles
and IR25a−/− flies show normal LD and DD behaviour. a, Canton S
and IR25a−/− flies were exposed to LL at 25 °C for 2-3 days, followed by
exposure to high-amplitude 12 h:12 h temperature cycles in LL, which
after 5-6 days were delayed by 6 h. Double plotted average actograms
depicting the daily activity levels and environmental conditions during
the entire experiment are shown. Actual temperatures are colour coded
and indicated below the entrainment index calculations. Numbers (n)
indicated in the bars. b, As in a but flies were initially kept in LD and
DD for 2 days each (left) or LD (right), before being exposed to two
phase delayed (left) or advanced (right) temperature cycles in DD at the
temperatures indicated. n.s., not significant. c, Behaviour of IR25a−/− and
rescue flies during DD and 25 °C:27 °C temperature cycles with 8 h delay
and during DD and 21 °C:23 °C temperature cycles with an 8 h advance
compared to the previous LD cycle (at 25 °C). Warm phase is indicated
by orange shading. d, Canton S and IR25a−/− flies during LD and DD
conditions at 25 °C (see Extended Data Table 2 for period calculations).


Extended Data Figure 5 | Antennal IR25a expression is not necessary
for synchronization of locomotor activity rhythms to temperature
cycles. Ablation of antennae as indicated. a, IR25a−/− and rescue flies
were exposed to the same condition used in Fig. 2. Actograms and daily
averages as described before. b, Quantification of behaviour as described
in Fig. 2. The data of IR25a−/− with normal antennae was taken from
Fig. 2a. n.s., not significant.


Extended Data Figure 6 | Knocking down IR25a expression via RNA
interference disrupts synchronization of locomotor activity rhythms
to temperature cycles (25 °C:27 °C in LL). a, b, Behaviour of flies with
spatially restricted IR25a knockdown mediated by IR25a-gal4 (a),
ChO specific F-gal4, and nompC-gal4 (b) driven IR25-RNAi expression,
respectively. Control flies are UAS-dicer2/Y; IR25-gal4/+ (a), and
UAS-dicer2/Y;+/+; F-gal4/+ or UAS-dicer2/Y;+/+; nompC-gal4/+ (b).
Test flies carry the same transgenes, but in addition one or two copies of
the IR25a-RNAi line indicated. Actograms and daily averages as described
for Fig. 2a. c, Progeny of the respective UAS-IR25a-RNAi lines crossed to
y w (left three columns) and flies from (a, b) and the other gal4 drivers
indicated, were exposed to the same LL and temperature cycle conditions
used in Fig. 2a. As controls, UAS-dicer, gal4 driver lines were crossed to
y w and F1 males containing UAS-dicer/Y and the respective gal4/+ were
tested. Numbers of analysed individuals (n) are indicated above each
column. Entrainment was quantified as in Fig. 2c. ****P < 0.0001,
one-way ANOVA followed by Bonferroni correction.


Extended Data Figure 7 | Rescue of TIM oscillations in clock neurons
during low-amplitude temperature cycles and normal TIM oscillations
during LD and high-amplitude temperature cycles. a, b, TIM levels in
clock neurons during LL 25 °C:27 °C temperature cycles at the indicated
time points (ZT) in the genotypes indicated. At least 8 brain hemispheres
per time point were analysed for each genotype. Scale bars, 10 µm.
Data in b are mean ± s.e.m. c, Quantification of TIM levels in clock
neurons during LD (25 °C) in Canton S and IR25a−/− mutant brains.
d, TIM oscillations in different clock-neuronal groups in IR25a−/−
are restored in 25 °C:29 °C temperature cycles in LL. At least 8 brain
hemispheres per time point were analysed for each genotype and
condition. Error bars indicate s.e.m.

Extended Data Figure 8 | Ectopic expression and heat responses of TRPA1
and IR8a in l-LNv clock neurons. a, Whole-cell current clamp recordings
of Pdf-gal4/UAS-TrpA1; Pdf-RFP (top trace, red) and Pdf-gal4/UAS-IR8a-RFP
(bottom trace, black) brains exposed to a temperature ramp from
18 °C to 30 °C and back to 18 °C. Note the additional depolarization of the
TrpA1 neuron at higher temperatures. b, Compared to control (Fig. 4),
recordings from TrpA1-expressing neurons show a large increase in firing
rate with temperature which the IR8a expressing neurons do not.
c, d, In comparison to control neurons (data taken from Fig. 4) the
membrane potential of TrpA1 expressing neurons is more positive at 30 °C
(open bars) and the input resistance is also significantly reduced in TrpA1
at 18 °C. e, f, The firing rate at 18 °C is higher for IR8a neurons but only the
Q10 of TrpA1 is different to control. Bars are means and whiskers s.e.m.,
n indicated in bars, *P < 0.05, ***P < 0.001, ANOVA followed by Tukey test.




Extended Data Table 1 | Mass spectrometry data from fly heads from three different genotypes 


UNIQUE PEPTIDES % SEQ. MASCOT EMPAI PEPTIDE SEQUENCES 
COVERAGE SCORE SCORE 
FLYBASE ID GENE DESCRIPTION FSNS FSNH NEG FSNS FSNH NEG FSNS_FSNH FSNS FSNH FSNS FSNH 
FBpp0298823nocte no circadian 22 17° 410.87 8.79 1.99 1399.5 713 0.30 0.20LSASTTSWQR,LGYEEYGK,GASVGGS LGYEEYGK,GASVGGSSGYGR, 
temperature SGYGR,SISGGYVQR,QPVGTGSAGGS QPVGTGSAGGSGSGGSGR,QA 
entrainment, GSGGSGR,QDDIDFTK,QAQALPR,GYA QALPR,GYAGSSGGSSVGSGS 
isoform C GSSGGSSVGSGSSYR,GGVSGGGGGV SYR,RQPVGTGSAGGSGSGGS 


SGGAQANAGQGR,RQPVGTGSAGGS GR,GGVSGGGGGVSGGAQAN 

GSGGSGR,NASDWGSSR,TLTTDPMPT AGQGR, TLTTDPMPTGILR, TSE 
QILR, TSESETDLDKTK,QQQQQQQQLP SETDLDKTK,QQQQQQQQLPR, 
R,SPLSADMSLGLAK,KLQELEMK,SAS SPLSADMSLGLAK,KLQELEMK 
ASSAFDSNSR,EQAAAAAVAAQR,LGF ,SASASSAFDSNSR,EQAAAAA 
SFGDDPTTPLK,FTALDINR,KIESCAVV VAAQR,LGFSFGDDPTTPLK,FT 


GGEK,SASPAVVGSGSFR ALDINR,SASPAVVGSGSFR 
FBpp0079064 ninaCc neither inactivation 20 8 2 15.66 5.8 1.27 447 102.5 0.25 0.09AILMLVNAGTPVNNDSTR,MYPEDLAAL LPFDEFLR,TALDNLLTKPDGLF 
nor afterpotentialC, ENPVDENIIESLR,AMFQIIR,LCDFGLSR, YIIDDASR,TLYKEPELFVDR,SDI 
isoform B TLYKEPELFVDR,SSLDESIMLMFTNQLT AEMLELSR,QYTTEEAR,SCQD 


K,DAVASTLYSR,YQFLAFDFDEPVEMT QDLIMDR,LVDFIINR,AAIELNR 
K,LPFDEFLR,TALDNLLTKPDGLFYIIDD 
ASR, YAEVENTDIVSR, YY NDEFLAR,MG 
ESDNIYNQGYFR,SDIAEMLELSR,AFTDI 
NR,QYTTEEAR,ADLEYKPR,SCQDQDLI 
MDR,LVDFIINR,AAIELNR, 

FBpp0072672 alpha-Spec alpha spectrin, 46 7 221.61 3.73 0.95 13175 81.5 0.36 0.04VSTLGAEAQR,LLDSYDLQR,QNQINSQ QEAFLANEDLGDSLDSVEALIK, 

isoformA YDNLLALAR, DLIGVQNLIK,FATDDSYLD LLNVISSGENMLK,ILETVEDIQE 

PTNLNGK,QEAFLANEDLGDSLDSVEAL R,ALNQAWAELK, YAALAAPMG 
IK,_LMDVSNLGVPEIEQR,VTEVNQLADK, ER,IQTQMQDLNEK,LNEACQQ 
DLTGVQNLK,CNSIEEIR,DQPFASDDIR, QQFNR 
DLASVQALQR,QQETPVVDITGK,LLAM 
QEQFR,LNEACQQQQFNR,FIESGHFDA 
DNIR,DADETVAWIAEK,QGFVPAAYIK, 
MQEIVVLWETLVQASDK, |QSVLAMGGN 
LIDK,RAALQEK,LLNVISSGENMLK,FDD 
FNDDLK,QAEIANYWQSLTTK, YAALAAP. 
MGER,DVVLSSDDYGR,DVAGAEALLER 
,>MQEIVVLWETLVQASDKK,LLVGSDDY 
GR,LGDEQTLQQFSR, ILYEQCMDLQLF 
YR,LQAASEESYRDPTNLQAK,DLEDEA 
AWIR,QLLEDSNR,EKEPIAASTNR,QLD 
ETANR.ALDIFATK,ETENVQSYEEIENAF 
R,AlISADELAK ALAALDQK,|ILETVEDIQ 
ER,ALNQAWAELK,NKEGNLSAR,|QTQ 
MQDLNEK,LIDGQHYAADDVAQR,DADE 


IENWIAEK 
FBp| 2 1 0 0.14 AQSDSTAVAASR 
LR, EDVGRDEA 
MLFDANR,MLDTMTPGK 
DNR, AQSDSTAVAA 
ELAEEAER,LKQET 
R,EDNFGAC! 
16 5 0 0 712 44 0.14 ;DIQTA 
FBpp0072127Ca-P60A calcium ATPase at 13 2 015.49 2.55 0 411.33 345 0.21 0.03VIVITGDNK,EVFDSIVR,TGTLTTNQMSV TGTLTTNQMSVSR,NILFSGTNV 
60A, isoform H SR,YGPNELPTEEGK,EFTLEFSR,LNSF AAGK 
SVNK,FSIPVVLLDETLK,SAAEMVLADD 
NFSSIVSAVEEGR, TVEQSLNFFGTDPE 
R,EFDDLSPTEQK,VGEATETALIVLAEK, 
18 0 014.12 0 0 249 0.41 
MENQNAE! 
FBpp0312078 sesB stress-sensitive B, 12 7 23746 22.07 5.69 202.5 62 1.24 0O.50GTGGAFVLVLYDEIK,YFPTQALNFAFK, YFPTQALNFAFK,EFTGLGNCLT 
isoform B EFTGLGNCLTK,EQGFSSFWR,GMVDC K,EQGFSSFWR,TAVAPIER,GA 


FIR,GMLPDPK, TAVAPIER, QVFLGGVDK FSNILR,QEGTGAFFK,SDGIVGL. 
»ATEVIYK,GAFSNILR,QEGTGAFFK,SD YR 


FBpp008848 1 Syn synapsin, i 1F 9 ie) 0 ie) 0 0.32 
8 121.07 266 2.66 32 V 
LK,FDMNS 
VNR,DVDFSVLTK,VILADNSTIPK E 
SGILPAQIFDGFPR 
70 (2)01289 6 fo} ) te) 0 0.09 B LFE 
G LNELENIDDELEKE V 
R EGNLEDEEK,|PALYEGDLMNE 
DEVLE DEDDVIEDVTSK 
FBpp0081031Dap160 dynamin associated 6 2 0 62 1.64 te) 162 24 0.14 0.03DTSMSEMSQLK,AELSALITK,KEDINTN SGYLTGSQAR,LLQLTQER 
protein 160, isoform DVQMSELK,ALQPQAGFVTGAQAK,LLQ 
A LTQER, YTQVFNANDR 
FBpp0293349 CG43078 CG43078, isoformB 5 2 1316 1.17 O07 80.5 89 0.07 0.02VVMETIDDDEFFLR,QYISEAIR,GISEDNI QYISEAIR,AAELDDNEDVGPR 
QLR,QFAEFDEENR,AELDDNEDVGPR 
FBpp0085357 Ars2 Ars2, isoform D 4 2 0 551 2.23 te) 137 72 0.13 0.06VAIADPLVER,FVQANTQELAK.VTNNDV VDSSQADALIR,VAIADPLVER 
FBpp0290 4 1 0 463 099 0 92 23 
3 io} 0) o} 0 45 
FBpp0307 1 0 0.64 0 52 QLEKLK QLEKLK 


gmr-gal4 > UAS-Flag-Strep-Nocte-Strep (FSNS), gmr-gal4 > UAS-Flag-Strep-Nocte-HA (FSNH) and control gmr-gal4 (driver only, NEG). Data show the numbers of unique peptides, the % protein sequence
coverage, Mascot Scores (Matrix Science) and EMPAI (empirical abundance index) scores and the peptides sequences derived from Mascot search engine. Data were compared using Protein Centre 
(Thermo). Black entries are high confidence hits and grey entries are lower confidence based on prior knowledge of known contaminants*“. nocte mutants show defects in ChO morphology, pointing 
to a structural role of NOCTE in ChO cilia?. Consequently, the majority of the identified proteins (10/16) likely regulate function and dynamics of the ChO neuron cilia. As we were mainly interested in 
identifying potential temperature receptors, we focused on other NOCTE-interacting proteins, particularly on Ionotropic Receptor 25a (IR25a).




Extended Data Table 2 | Rhythm analysis of control and IR25a mutant flies under free running (DD) conditions at different ambient temperatures

Genotype    Temperature (°C)    n     % Rhythmic    Period (h) ± s.e.m.
Canton S    18                  32    84            23.7 ± 0.2
IR25a−/−    18                  32    75            24.1 ± 0.3
rescue      18                  28    60            23.6 ± 0.1
perL        18                  22    82            26.8 ± 0.1
Canton S    25                  45    96            24.2 ± 0.1
IR25a−/−    25                  44    75            24.0 ± 0.1
rescue      25                  30    97            23.6 ± 0.1
perL        25                  26    38            28.4 ± 0.2
Canton S    29                  60    100           23.4 ± 0.1
IR25a−/−    29                  39    77            23.4 ± 0.1
rescue      29                  27    100           23.8 ± 0.1
perL        29                  54    52            29.6 ± 0.4

Period values were calculated using autocorrelation as described in ref. 40. Flies with a rhythm statistics (RS) value >1.5 were considered rhythmic (ref. 40).




LETTER 


doi:10.1038/nature15516 


Fungal pathogen uses sex pheromone receptor for 
chemotropic sensing of host plant signals 


David Turra1, Mennat El Ghalid1, Federico Rossi1 & Antonio Di Pietro1


For more than a century, fungal pathogens and symbionts have 
been known to orient hyphal growth towards chemical stimuli 
from the host plant¹,². However, the nature of the plant signals 
as well as the mechanisms underlying the chemotropic response 
have remained elusive³. Here we show that directed growth of the 
soil-inhabiting plant pathogen Fusarium oxysporum towards 
the roots of the host tomato (Solanum lycopersicum) is triggered 
by the catalytic activity of secreted class III peroxidases, a family 
of haem-containing enzymes present in all land plants⁴. The 
chemotropic response requires conserved elements of the fungal cell 
integrity mitogen-activated protein kinase (MAPK) cascade⁵ and the 
seven-pass transmembrane protein Ste2, a functional homologue of 
the Saccharomyces cerevisiae sex pheromone α receptor⁶. We further 
show that directed hyphal growth of F. oxysporum towards nutrient 
sources such as sugars and amino acids is governed by a functionally 
distinct MAPK cascade. These results reveal a potentially conserved 
chemotropic mechanism in root-colonizing fungi, and suggest a new 
function for the fungal pheromone-sensing machinery in locating 
plant hosts in a complex environment such as the soil. 
Root-colonizing fungi have a dramatic impact on plant health⁷. 
Beneficial symbionts such as mycorrhiza promote plant growth by 
supplying nutrients and microelements, while soil-borne pathogens 
provoke devastating yield losses and are highly persistent and 
difficult to control. F. oxysporum causes vascular wilt disease in over 
100 field and greenhouse crops⁸. Infectious hyphae penetrate the roots 



preferentially through natural openings at the junctions of epidermal 
cells⁹, indicating that the fungus can sense and grow towards chemical 
signals from the host plant. To learn more about the underlying 
mechanism, we developed a quantitative chemotropism assay on 
agar plates (Extended Data Fig. 1a-c). Microconidia of F. oxysporum 
exposed to a gradient of glutamate (Glu) produced significantly more 
germ tubes pointing towards the nutrient source than towards the 
solvent control, resulting in positive chemotropism (Extended Data 
Fig. 1d and Fig. 1a). Two hours of exposure was sufficient to induce 
a chemotropic response, indicating rapid reorientation of the hyphal 
growth axis towards the new gradient. Different nitrogen and carbon 
sources such as Glu, Asp or glucose elicited a chemotropic response, 
while others such as Gln, Met, ammonium or galactose did not 
(Fig. 1b). Importantly, germ tubes growing towards the chemoattractant 
did not differ in length from those growing towards the solvent, 
ruling out a bias from growth speed (Extended Data Fig. 1e). Thus, 
F. oxysporum responds rapidly and specifically to nutrients by redirecting 
hyphal growth towards the chemoattractant gradient. 

In the model fungi S. cerevisiae and Neurospora crassa, chemotropism 
towards a mating partner is mediated by opposite gradients of diffusible 
peptide sex pheromones⁶,¹⁰,¹¹. Although no sexual cycle has yet 
been reported in F. oxysporum, its genome encodes a putative protein 
with the characteristic hallmarks of fungal α-pheromone precursors, 
containing ten α-pheromone decapeptide repeats with near-identical 
sequence (Extended Data Fig. 1f, g). Synthetic α-pheromone from either 




Figure 1 | F. oxysporum exhibits chemotropic growth towards different 
compounds. a, Germination of microconidia over time. Germ tube 
emergence sites are visualized by Triticum vulgaris lectin-fluorescein 
isothiocyanate (FITC) staining. DIC, differential interference contrast. 
Scale bar, 5 µm. b, Directed growth of germ tubes after 13 h exposure to a 
gradient of the indicated compounds. Gluc, glucose; Gal, galactose; Glyc, 
glycerol; Cel, cellulose; Pec, pectin (versus solvent control, *P < 0.0001). 
c, Directed growth towards a gradient of synthetic α-pheromone (α-pher) 
of S. cerevisiae (S. c.) or of F. oxysporum (F. o.), either untreated (C), 
boiled (100 °C) or treated with trypsin (Trp), proteinase K (PK), boiled 
PK (PK 100 °C) or PK plus its inhibitor phenylmethanesulfonylfluoride 
(PK + PMSF); or of α-pheromone analogues (D-Ala1,2) or (D-Ala6,7) (versus 
untreated, *P < 0.0001). b, c, Data are presented as the mean from two 
experiments. n = 500 germ tubes. Error bars show standard deviation (s.d.). 


¹Departamento de Genética, Campus de Excelencia Internacional Agroalimentario ceiA3, Universidad de Córdoba, 14071 Córdoba, Spain. 


26 NOVEMBER 2015 | VOL 527 | NATURE | 521 






Figure 2 | Secreted tomato root peroxidases elicit fungal chemotropism. 
a, Directed growth of F. oxysporum towards tomato roots (TR) or root 
exudate (RE) (versus H2O, *P < 0.0001). b, c, Secretion of peroxidase 
activity (b) and its spatial distribution (c) in tomato roots was visualized 
by staining with 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) 
(ABTS) plus H2O2. Scale bars, 1 cm (b) and 1 mm (c). d, e, Detail of the 
root section marked in c, showing colonization by F. oxysporum expressing 
green fluorescent protein (GFP). Scale bar, 250 µm. Experiments were 
performed four times with similar results. f, Peroxidase enzymatic activity 
is required for chemoattraction. Directed growth of germ tubes towards 
root exudate or a gradient of 4 µM HRP, either untreated (C), boiled 


F. oxysporum or S. cerevisiae elicited a robust chemotropic response that 
was largely abolished by protease treatment or alanine substitution of two 
conserved residues (Gly6 and Gln7), indicating its specificity (Fig. 1c). 

We next asked whether F. oxysporum exhibits directed growth 
towards the host plant. Both tomato roots and root exudate induced 
a significant chemotropic response (Fig. 2a). Chemoattractant activity 
of root exudate was sensitive to proteinase K treatment, partitioned 
into the water phase after ethyl acetate extraction and resided 
predominantly in the molecular weight fraction between 30 and 
50 kilodaltons (kDa), suggesting that it originates from one or several 
proteins (Extended Data Fig. 2a). Separation by anion exchange 
chromatography (Extended Data Fig. 2b, c) and SDS-polyacrylamide gel 
electrophoresis (SDS-PAGE) identified two protein bands in the 
chemotropically active fractions that were absent from the inactive ones and 
elicited a significant chemotropic response (Extended Data Fig. 2d, e). 
Analysis by in-gel tryptic digestion followed by liquid chromatography- 
electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) 
identified three tomato proteins, TMP1, TMP2 and CEVI-1 (Extended 
Data Fig. 3a). They belong to class III peroxidases, secreted 
haem-containing oxidoreductases present in all land plants, which catalyse 
the reductive cleavage of hydrogen peroxide by an electron donor¹², 
are encoded by multigene families, and function in diverse physiological 
processes such as cell wall modification and pathogen defence⁴. 
Arabidopsis thaliana has over 70 members of this family, many of which 
are expressed and secreted in roots¹³. 

We observed a strong gradient of peroxidase activity exuded by 
tomato roots into the adjacent medium (Fig. 2b). The highest enzymatic 
activity was associated with the root hair zone, which also exhibits 
the highest density of colonization by F. oxysporum (Fig. 2c-e). 
Secreted peroxidase activity differed considerably between root exudates 




(100°C), or in the presence of the peroxidase inhibitor salicylhydroxamic 
acid (SHAM) or the oxygen radical scavenger (+)-sodium L-ascorbate 
(Asc) (versus untreated, *P< 0.0001). g, h, Chemoattractant activity of 
heterologously expressed tomato peroxidase requires an intact catalytic site. 
g, Enzymatic activity of 56 nM recombinant tomato peroxidases CEVI-1, 
TMP2 and TMP2(R38S,H42E), indicated as units ml⁻¹. Data are presented 
as the mean from two experiments, each with two technical replicates 
(versus TMP2, *P<0.005). h, Directed growth towards a gradient of 

169 nM recombinant CEVI-1, TMP2 or TMP2(R38S,H42E) (versus TMP2, 
*P<0.0001). a, f, h, Data are presented as the mean from two experiments. 
n=500 germ tubes. Error bars show s.d. 


collected from different tomato plants, reflecting the multiplicity and 
complex regulation of plant class III peroxidase genes⁴. Importantly, 
enzymatic activity in individual root exudates correlated significantly 
with fungal chemoattraction (Extended Data Fig. 2f, g). We next tested 
horseradish peroxidase (HRP), which shares 39%, 37% and 49% amino 
acid identity with tomato TMP1, TMP2 and CEVI-1, respectively 
(Extended Data Fig. 3b), and whose molecular structure and catalytic 
properties are well characterized¹². Commercial HRP triggered a 
robust chemotropic response in F. oxysporum (Fig. 2f). Chemotropism 
induced by HRP and tomato root exudate was abolished when peroxidase 
enzymatic activity was eliminated by boiling or by addition of the 
specific inhibitor salicylhydroxamic acid (SHAM), or in the presence 
of the oxygen radical scavenger ascorbate (Extended Data Fig. 2h and 
Fig. 2f). However, these inhibitors did not prevent chemotropism 
towards glucose or α-pheromone (Extended Data Fig. 2i). 

To confirm the importance of peroxidase catalytic activity in the 
process of chemoattraction, we heterologously expressed tomato perox- 
idases CEVI-1 and TMP2, as well as a point-mutated version of TMP2 
in which the conserved Arg 38 and His 42 residues¹² were substituted 
by Ser and Glu, respectively (Extended Data Fig. 3b). Recombinant 
catalytically active CEVI-1 and TMP2, but not catalytically inactive 
TMP2(R38S,H42E), triggered a robust chemotropic response in 
F. oxysporum (Fig. 2g, h). Collectively, these findings demonstrate that 
secreted root peroxidases elicit a chemotropic response in F. oxysporum 
through a mechanism that requires enzymatic activity. 

We next asked whether directed hyphal growth towards different 
chemoattractants is mediated by common or distinct cellular mechanisms. 
Both α-pheromone and Glu exhibited a bell-shaped dose-response 
curve with a gradual decrease at higher concentrations (Fig. 3a), 
which could be explained by receptor saturation, as previously shown 


Figure 3 | Chemotropism towards nutrients and α-pheromone is 
governed by distinct MAPK cascades. a, Dose-response curves for 
chemotropism towards α-pheromone (α-pher) or Glu. b, Elements of 
the F. oxysporum Fmk1 and Mpk1 MAPK cascades. c, d, Directed growth 


for the pheromone response in S. cerevisiae¹¹. Remarkably, the chemotropic 
sensitivity of F. oxysporum to α-pheromone was three orders of 
magnitude higher than that to Glu, suggesting that the two responses 
may be governed by distinct cellular mechanisms. To test this idea, 
we used fungal mutants lacking defined elements of MAPK cascades, 
three-component signalling modules conserved from yeast to humans 
that function in succession to transmit a variety of cellular signals¹⁴. 
Like most fungi, F. oxysporum has three MAPKs orthologous to 
S. cerevisiae Kss1, Mpk1 and Hog1 (ref. 15). The p42/44 MAPK Fmk1, 
its upstream MAPKK Ste7 and MAPKKK Ste11, as well as the downstream 
transcription factor Ste12 (Fig. 3b), function in a conserved 
pathway that governs filamentation in S. cerevisiae and invasive growth 
in plant pathogens¹⁶,¹⁷. Isogenic F. oxysporum mutants lacking fmk1, 
ste7, ste11 or ste12 were impaired in chemotropism towards Glu or 
glucose, but not α-pheromone (Fig. 3c and Extended Data Fig. 4). 
S. cerevisiae mutants lacking the orthologous MAPKs Fus3 or Kss1 also 
maintain most of the chemotropic response towards α-pheromone⁶. To 
investigate whether another MAPK mediates chemotropic sensing of 
α-pheromone, we used the chemical inhibitors PD98059 and SB202190, 
which selectively block p42/44 MAPKs and p38 MAPKs, respectively. 
PD98059, but not SB202190, prevented chemotropism towards 
α-pheromone (Extended Data Fig. 5a), pointing to a role of the second 
p42/44 MAPK, Mpk1. In S. cerevisiae, Mpk1 together with Mkk1/2 and 
Bck1 functions in the cell wall integrity (CWI) MAPK module⁵, which 
is activated during the pheromone response¹⁸. Mutations in CWI pathway 
components have pleiotropic effects on cell wall architecture and 
impair fungal virulence on plants⁵,¹⁵. Loss of the orthologous genes 
mpk1, mkk2 or bck1 in F. oxysporum (Fig. 3b) led to high sensitivity 
to the cell-wall-perturbing compounds Congo red and Calcofluor 
white, confirming that they function in the CWI response (Extended 
Data Fig. 5b-g). Interestingly, the mutants were impaired in 
chemotropism towards α-pheromone but remained responsive to Glu or 
glucose (Fig. 3c and Extended Data Fig. 5h, i). Moreover, a mutant in 



of indicated fungal strains towards a gradient of Glu, glucose (Gluc) or 
α-pheromone (c), or tomato roots, root exudate or HRP (d) (versus wild 
type (WT), *P < 0.0001). a, c, d, Data are presented as the mean from two 
experiments. n= 500 germ tubes. Error bars show s.d. 


the small G protein Rho1 (ref. 19), located upstream of the CWI MAPK 
module⁵, also failed to grow towards α-pheromone. In S. cerevisiae, 
Rho1 mediates localization of key components of the CWI pathway to 
the tips of pheromone-induced mating projections²⁰. 

We sought to confirm these results independently using a different 
chemotropism assay based on the angle of hyphal tip projections 
relative to the chemoattractant gradient. This method has been used 
extensively in S. cerevisiae to study chemotropic responses to mating 
pheromones⁶,¹⁰,¹¹. The average cosine of hyphal tip projection angles 
was significantly higher when F. oxysporum was exposed to a gradient 
of Glu or α-pheromone compared with the water control, confirming 
positive chemotropism (Extended Data Fig. 6a, b). The fmk1Δ mutant 
was specifically impaired in growth towards Glu while the mpk1Δ 
mutant failed to respond to α-pheromone. Taken together, these results 
establish that the invasive growth and CWI MAPK pathways have distinct 
and complementary roles in chemotropic sensing of nutrients and 
sex pheromones. Consistent with this model, a fmk1Δ mpk1Δ double 
mutant responded neither to nutrients nor to α-pheromone (Fig. 3c). 

We next tested whether these MAPK cascades are required for 
chemotropic growth of F. oxysporum towards the host plant. Mutants 
lacking Fmk1, Ste7, Ste11 or Ste12 were not affected in chemotropism 
towards tomato roots or root exudate, but those lacking Mpk1, Mkk2, 
Bck1 or Rho1 were impaired (Fig. 3d and Extended Data Figs 4g, 5i). 
Importantly, CWI components were also essential for the chemotropic 
response to HRP while Fmk1 and Ste12 were not (Fig. 3d and Extended 
Data Fig. 5i). Thus, peroxidase-mediated chemotropism of F. oxysporum 
towards plant roots is specifically governed by the CWI MAPK cascade. 

In S. cerevisiae and N. crassa, chemotropic sensing of α-pheromone 
requires the seven-pass transmembrane (7TM) G-protein-coupled 
receptor (GPCR) Ste2 or Pre-2, respectively⁶,¹⁰. F. oxysporum has a 
putative Ste2 orthologue with seven predicted transmembrane regions 
and a topology characteristic of ascomycete α-pheromone receptors 
(Extended Data Fig. 7a, b). Loss of Ste2 (Extended Data Fig. 7c, d) 




Figure 4 | The 7TM domain receptor Ste2 is required for chemotropic 
sensing of α-pheromone and root peroxidases. a-e, Directed growth of the 
indicated fungal strains exposed to a gradient of Glu, glucose (Gluc) or 
α-pheromone (α-pher) (a); tomato roots (TR), root exudate (RE) or HRP (b); 
or exposed simultaneously to competing gradients of Glu opposed to either 
α-pheromone (c), root exudate (d) or HRP (e) (versus wild type, 
*P < 0.0001). Data are presented as the mean from two experiments. 
n = 500 germ tubes. Error bars show s.d. 


abolished chemotropism of F. oxysporum towards α-pheromone but not 
towards nutrients, confirming that it is a functional homologue of yeast 
Ste2 (Fig. 4a). Strikingly, ste2Δ mutants were impaired in chemotropism 
towards tomato roots, root exudate and HRP (Fig. 4b). This was unexpected 
because Ste2 is generally regarded as a specific receptor for 
α-pheromone²¹. To corroborate further the role of Ste2 in chemotropic 
sensing of root chemoattractants, fungal strains were exposed to opposite 
gradients of Glu versus either α-pheromone, root exudate or HRP. 
The wild-type and the complemented strain failed to display directed 
growth, suggesting that chemotropism is annulled in the presence of 
two competing gradients (Fig. 4c-e), while ste2Δ grew towards Glu, 
confirming its inability to sense a competing gradient of α-pheromone, 
root exudate or HRP. Loss of Ste2 caused a small but significant 
decrease in virulence of F. oxysporum on tomato plants, indicating 
that Ste2-mediated chemotropic sensing of secreted root compounds 
is important for initiation of fungal infection (Extended Data Fig. 7e). 
Interestingly, in a previous study Petunia plants defective in secretion 
of the signalling compound strigolactone were significantly delayed in 
the symbiotic interaction with arbuscular mycorrhizal fungi²². 

Our results reveal a previously unknown ability of the fungal pathogen 
F. oxysporum to reorient hyphal growth towards a variety of 
chemical signals. On the basis of genetic evidence, we propose that 
chemotropism is mediated by distinct MAPK modules: Fmk1 for nutrients 
and Mpk1 for sex pheromones and plant compounds (Extended 
Data Fig. 8). Remarkably, F. oxysporum uses the same signalling pathway 
for chemotropic sensing of mating factors and host cues, including Ste2, 
a 7TM GPCR that was previously thought to function specifically in 
α-pheromone sensing. How secreted plant peroxidases generate a 
chemoattractant signal, and how Ste2 mediates signal sensing and chemotropic 
response in concert with the CWI pathway, remain to be determined. 
Since class III peroxidases and fungal MAPK cascades are evolutionarily 
conserved⁴,¹⁵, our findings might be of general relevance to the 
chemotropic interaction between plants and root-colonizing fungi. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 17 June 2014; accepted 21 August 2015. 
Published online 26 October 2015. 


1. de Bary, A. Vergleichende Morphologie und Biologie der Pilze, Mycetozoen, und 
   Bacterien (Wilhelm Engelmann, 1884). 

2. Zentmyer, G. A. Chemotaxis of zoospores for root exudates. Science 133, 
   1595-1596 (1961). 

3. Brand, A. & Gow, N. A. Mechanisms of hypha orientation of fungi. Curr. Opin. 
   Microbiol. 12, 350-357 (2009). 

4. Passardi, F., Penel, C. & Dunand, C. Performing the paradoxical: how plant 
   peroxidases modify the cell wall. Trends Plant Sci. 9, 534-540 (2004). 

5. Levin, D. E. Cell wall integrity signaling in Saccharomyces cerevisiae. Microbiol. 
   Mol. Biol. Rev. 69, 262-291 (2005). 

6. Arkowitz, R. A. Chemical gradients and chemotropism in yeast. Cold Spring 
   Harb. Perspect. Biol. 1, a001958 (2009). 

7. Berendsen, R. L., Pieterse, C. M. & Bakker, P. A. The rhizosphere microbiome 
   and plant health. Trends Plant Sci. 17, 478-486 (2012). 

8. Dean, R. et al. The Top 10 fungal pathogens in molecular plant pathology. Mol. 
   Plant Pathol. 13, 414-430 (2012). 

9. Pérez-Nadales, E. & Di Pietro, A. The membrane mucin Msb2 regulates 
   invasive growth and plant infection in Fusarium oxysporum. Plant Cell 23, 
   1171-1185 (2011). 

10. Kim, H. & Borkovich, K. A. A pheromone receptor gene, pre-1, is essential 
    for mating type-specific directional growth and fusion of trichogynes and 
    female fertility in Neurospora crassa. Mol. Microbiol. 52, 1781-1798 (2004). 

11. Segall, J. E. Polarization of yeast cells in spatial gradients of alpha mating 
    factor. Proc. Natl Acad. Sci. USA 90, 8332-8336 (1993). 

12. Berglund, G. I. et al. The catalytic pathway of horseradish peroxidase at high 
    resolution. Nature 417, 463-468 (2002). 

13. Badri, D. V. et al. Root secreted metabolites and proteins are involved in the 
    early events of plant-plant recognition prior to competition. PLoS ONE 7, 
    e46640 (2012). 

14. Widmann, C., Gibson, S., Jarpe, M. B. & Johnson, G. L. Mitogen-activated 
    protein kinase: conservation of a three-kinase module from yeast to human. 
    Physiol. Rev. 79, 143-180 (1999). 

15. Turra, D., Segorbe, D. & Di Pietro, A. Protein kinases in plant-pathogenic fungi: 
    conserved regulators of infection. Annu. Rev. Phytopathol. 52, 267-288 (2014). 

16. Di Pietro, A., Garcia-MacEira, F. I., Méglecz, E. & Roncero, M. I. A MAP kinase of 
    the vascular wilt fungus Fusarium oxysporum is essential for root penetration 
    and pathogenesis. Mol. Microbiol. 39, 1140-1152 (2001). 

17. Liu, H., Styles, C. A. & Fink, G. R. Elements of the yeast pheromone response 
    pathway required for filamentous growth of diploids. Science 262, 
    1741-1744 (1993). 

18. Buehrer, B. M. & Errede, B. Coordination of the mating and cell integrity 
    mitogen-activated protein kinase pathways in Saccharomyces cerevisiae. Mol. 
    Cell. Biol. 17, 6517-6525 (1997). 

19. Martinez-Rocha, A. L. et al. Rho1 has distinct functions in morphogenesis, cell 
    wall biosynthesis and virulence of Fusarium oxysporum. Cell. Microbiol. 10, 
    1339-1351 (2008). 

20. Bar, E. E., Ellicott, A. T. & Stone, D. E. Gβγ recruits Rho1 to the site of polarized 
    growth during mating in budding yeast. J. Biol. Chem. 278, 21798-21804 (2003). 

21. Xue, C., Hsueh, Y. P. & Heitman, J. Magnificent seven: roles of G protein-coupled 
    receptors in extracellular sensing in fungi. FEMS Microbiol. Rev. 32, 1010-1032 
    (2008). 

22. Kretzschmar, T. et al. A petunia ABC protein controls strigolactone-dependent 
    symbiotic signalling and branching. Nature 483, 341-344 (2012). 

Acknowledgements The authors are grateful to E. Martinez Aguilera for 
technical assistance. This work was supported by grants BIO2010-15505 
and BIO2013-47870-R from the Spanish Ministerio de Innovación y 
Competitividad (MINECO), and BIO2008-04479 from MINECO/ERA-NET 
PathoGenoMics to A.D.P. M.E.G. was supported by the Marie Curie ITN 
ARIADNE (FP7-PEOPLE-ITN-237936) from the European Commission. F.R. 
had a fellowship from the ERASMUS student exchange program. 


Author Contributions D.T. and A.D.P. initiated the work and designed the 
experiments. D.T., M.E.G. and F.R. carried out the experiments and analysed the 
data. D.T. and A.D.P. wrote the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
A.D.P. (ge2dipia@uco.es). 




METHODS 


Fungal strain culture and transformation. Fungal strains used in this study 
are listed in Extended Data Table 1. All are derivatives of F. oxysporum f. sp. 
lycopersici isolate 4287 (FGSC 9935). Strain culture and storage were per- 
formed as described!°. Phenotypic analysis of colony growth and invasion of 
cellophane membranes was done as reported”’, Targeted gene replacement with 
the hygromycin resistance cassette and complementation of the mutants by co- 
transformation with the phleomycin resistance cassette were performed as 
reported’. Oligonucleotides used to generate PCR fragments for gene replace- 
ment, mutant identification and complementation are listed in Extended Data 
Table 2. F. oxysporum gene data are available in the Fusarium Comparative 
Database at the Broad Institute under the following accession numbers: ste11, 
FOXG_09411; ste7, FOXG_05521; fmk1, FOXG_08140; ste12, FOXG_02103; ste2, 
FOXG_10633; rho1, FOXG_13835; bck1, FOXG_08078; mkk2, FOXG_02117; 
mpk1, FOXG_05092. 

Quantification of fungal chemotropism. Freshly obtained microconidia were 
embedded in 4 ml water agar (WA; 0.5%, w/v) (Oxoid) at a final concentration of 
2.5.x 10° per ml and poured into a standard Petri dish (Extended Data Fig. 2a). A 
central scoring line was drawn on the bottom of the plate, and two parallel wells 
were cut into the WA layer on both sides at 5 mm distance from the scoring line. 
Then, 50 µl of the test compound solution or the solvent control were added to 
the wells at both sides of the scoring line. In gradient competition experiments, 
solutions of the two different test compounds were applied at both sides of the 
scoring line. Tested compounds and standard concentrations were: sodium 
glutamate (Glu), glutamine (Gln), sodium aspartate (Asp), methionine (Met), all 
at 295 mM; ammonium nitrate (NH4), glucose (Gluc), galactose (Gal), glycerol 
(Glyc), all at 50 mM; or cellulose (Cel), pectin (Pec), all at 1% (w/v). Sterile water 
or methanol were used as solvent controls. To measure chemotropism towards 
tomato plants, the root of a 2-week-old tomato seedling was placed directly on 
top of one of the wells. A sterile metal string was placed on the opposite well 
as a control. Plates were maintained in a plastic box at 28°C in the dark for the 
indicated time periods (13 h unless otherwise stated). Chemotropism of conidial 
germ tubes was quantified with an Olympus binocular microscope (200× 
magnification), by counting the number of hyphal tips pointing towards the test 
compound and those pointing towards the solvent control. The chemotropic 
index was calculated as ((H_test − H_solv)/H_total) × 100, where H_test is the 
number of hyphae growing towards the test compound, H_solv is the number of 
hyphae growing towards the solvent control, and H_total is the total number of 
hyphae counted. For each test compound a total of 500 hyphal tips were scored. 
All experiments were performed at least twice. Statistical analysis was 
conducted using t-test. 
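
The chemotropic index defined above can be illustrated with a short Python sketch. The function name and the example counts are hypothetical, and the sketch assumes that every scored tip points either towards the test compound or towards the solvent control, so that H_total = H_test + H_solv.

```python
def chemotropic_index(h_test: int, h_solv: int) -> float:
    """Return the chemotropic index in percent.

    h_test: hyphal tips pointing towards the test compound.
    h_solv: hyphal tips pointing towards the solvent control.
    Assumption: H_total = h_test + h_solv (each tip scored once).
    """
    h_total = h_test + h_solv
    if h_total == 0:
        raise ValueError("no hyphal tips were scored")
    return 100.0 * (h_test - h_solv) / h_total


# Hypothetical plate: 300 of 500 scored tips point towards the test compound.
example_index = chemotropic_index(300, 200)
print(f"chemotropic index = {example_index:.1f}%")
```

An index of 0 corresponds to no directional preference, and positive values indicate growth towards the test compound.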

For the hyphal tip projection assay, light microscopy photographs of 
chemotropism plate assays were recorded in a Leica DMR microscope (200× 
magnification) using a Leica DFC 300 FX digital camera, and the angle (in degrees) of the 
hyphal tip relative to the chemoattractant gradient was measured using the ImageJ 
software. For each test compound a total of 300 hyphal tip projection angles 
were measured. All experiments were performed at least twice. Length of germ 
tubes growing towards the test compound or the solvent control was measured 
using ImageJ. For visual monitoring of compound diffusion, 50 µl of a 1% (w/v) 
solution of Congo red in water was added to the test compound well and 50 µl of 
water into the solvent control well. Plates were incubated at 28 °C, and dye diffusion 
was documented after different time periods in a Leica MZ FLIII fluorescence 
stereomicroscope using a Leica DFC 300 FX digital camera. Dye intensity was 
quantified with the KODAK 1D Image Analysis software. 
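
The average cosine of hyphal tip projection angles reported for this assay can be sketched as follows. This is only an illustration under stated assumptions: the function name and sample angles are hypothetical, and an angle of 0° is assumed to mean a tip pointing directly up the chemoattractant gradient.

```python
import math
from typing import Sequence


def mean_cosine(angles_deg: Sequence[float]) -> float:
    """Average cosine of hyphal tip angles given in degrees.

    Assumption: 0 degrees points straight up the gradient, so values near
    +1 indicate growth towards the source and values near 0 indicate no
    directional preference.
    """
    if not angles_deg:
        raise ValueError("no angles were measured")
    return sum(math.cos(math.radians(a)) for a in angles_deg) / len(angles_deg)


# Hypothetical measurements for a handful of germ tubes.
sample_angles = [10.0, 35.0, 60.0, 120.0, 20.0]
print(f"mean cosine = {mean_cosine(sample_angles):.3f}")
```

Averaging cosines rather than raw angles avoids the wrap-around problem at ±180° and weights each tip by how closely it is aligned with the gradient.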

The MAPK inhibitors PD98059 and SB202190 (Calbiochem) were added to the 
WA medium at a final concentration of 10 or 30 µM, respectively, before adding the 
chemoattractant compound. Commercial horseradish peroxidase (HRP; Sigma) 
was assayed at a standard concentration of 4 µM. Peroxidase inhibitors/scavengers 
salicylhydroxamic acid (SH), thiourea (TU) and (+)-sodium L-ascorbate 
(Asc) (Sigma) were added directly to the chemoattractant solution at a final 
concentration of 60 mM, 60 mM and 160 mM, respectively. Synthetic F. oxysporum 
α-pheromone, its analogues (D-Ala1,2) and (D-Ala6,7), and S. cerevisiae α-factor 
were obtained from GenScript (Piscataway). Lyophilized peptides were 
dissolved in 50% (v/v) methanol in water and assayed at a standard concentration of 
378 µM. To test the effect of different treatments on the chemoattractant activity of 
F. oxysporum α-pheromone, the peptide was incubated for 10 min at 100 °C, or 
for 30 min at 37 °C with 1 mg ml⁻¹ trypsin, 1 mg ml⁻¹ proteinase K, 1 mg ml⁻¹ 
heat-denatured (10 min at 100 °C) proteinase K, or 1 mg ml⁻¹ proteinase K plus 
1 mM phenylmethylsulfonyl fluoride (PMSF) (all from Sigma). 

Light and fluorescence microscopy. Low-resolution imaging was performed 
using a Lumar.V12 fluorescence stereomicroscope (Zeiss). Wide-field fluo- 
rescence imaging was performed using a Zeiss Axio Imager M2 microscope 




equipped with a Photometrics Evolve EMCCD camera. To visualize growth of 
F. oxysporum on tomato roots, freshly obtained microconidia of the wild-type-GFP 
strain¹⁶ were embedded in WA as described earlier. One-day-old germinated 
tomato seeds were placed on top of the medium and incubated for 2 days 
at 28 °C before observation in the microscope. 

To visualize sites of germ tube emergence, F. oxysporum microconidia were 
stained with 50 µg ml⁻¹ fluorescein isothiocyanate-labelled lectin from Triticum 
vulgaris (WGA-FITC) (Sigma) in PBS containing 12.5 µM CaCl2 and 12.5 µM 
MnCl2. 

Identification of the root chemoattractant. Tomato seeds (cultivar Monica) 
were provided as a gift by Syngenta Seeds. Seeds were surface sterilized”, planted 
in moist vermiculite and maintained in a growth chamber (15/9 h light/dark 
photoperiod, 28°C) until the plants reached the second true leaf stage. Roots 
were washed carefully to remove the adhering substrate, placed in sterile water 
and kept at 25°C for 48h. The collected root exudate was filtered through a 
0.22-µm Millipore membrane and stored at −20 °C until use. To measure fresh 
root weight, roots were cut from individual plants, gently blotted with a paper 
towel and weighed. 

Filter-sterilized root exudates were partitioned with ethyl acetate to obtain an 
ethyl acetate fraction and a water fraction (WF). The WF was further separated 
by centrifugal ultrafiltration (MWCO 10 kDa; 30 kDa; 50 kDa) (Corning). The 
30-50 kDa fraction was applied to a HiTrap Q FF anion exchange chromatography 
column on an ÄKTA purifier (GE Healthcare), and proteins were eluted with a 
linear gradient of NaCl. Fractions were desalted by dialysis, tested for 
chemotropic activity as described earlier and analysed by SDS-polyacrylamide (10%) gel 
electrophoresis, followed by Coomassie blue staining. Protein bands were eluted 
from the gel and tested for chemoattractant activity. Proteins of interest were sub- 
jected to tryptic digestion and analysed by liquid chromatography-electrospray 
ionization-mass spectrometry (LC-ESI-MS). Identification of tomato proteins 
was carried out at the Protein Micro-Analysis Core Facility of the Biozentrum, 
Innsbruck Medical University. 

Heterologous expression of tomato peroxidases. Recombinant tomato peroxidase 
proteins were produced in Escherichia coli strain BL21 (DE3) ung-151 transformed 
with plasmids pET28a-Cevi-1, pET28a-Tmp2 and pET28a-Tmp2(R38A,H42A), 
respectively. Plasmids were obtained by subcloning the corresponding com- 
plementary DNA fragments lacking the sequence encoding the signal peptide 
in the vector pET28a(+), using XhoI and NdeI. Solubilization and re-folding
of recombinant peroxidases was performed as described”. Purification of the 
recombinant proteins was performed on an AKTA purifier using a Ni-NTA 
chromatography column. 

Peroxidase enzymatic activity assays. To visualize peroxidase activity secreted 
by roots, 4-day-old tomato seedlings were placed on 0.5% WA supplemented with 
0.91 mM 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) (Sigma)
and 2.5 mM H₂O₂ (J. T. Baker) (WA-ABTS), and incubated for 45 min at 28°C.
Seedlings were carefully removed from WA-ABTS, placed in a Petri dish containing 
0.5% agarose (w/v) and imaged with a stereo microscope. 

Peroxidase activity assays were carried out in 96-well microtitre plates. The reaction
mixture contained 0.91 mM ABTS, 2.5 mM H₂O₂ in phosphate-citrate buffer
(51 mM Na₂HPO₄, 24 mM citric acid, pH 5.6) in a final volume of 150 µl. Where
applicable, peroxidase inhibitors/scavengers (75 µM TU and SH or 250 µM Asc)
were pre-incubated with the buffer for 5 min before adding ABTS and H₂O₂. For
each reaction, a blank containing heat-inactivated (20 min boiling) peroxidase was
included. Reactions were incubated at 28°C and absorbance at 405 nm was measured
at different time intervals in a Spectrafluor Plus microplate reader (Tecan).
Peroxidase activity was calculated in units per ml, using the formula
((ΔA405 nm/min test − ΔA405 nm/min blank) × (total volume assay) × (dilution factor))/
((millimolar extinction coefficient of oxidized ABTS at 405 nm) × (volume enzyme
used)). Statistical analysis was conducted using t-tests.
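
The activity formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the argument names and the example numbers are hypothetical, and the extinction coefficient must be supplied by the user because the text does not state the value that was used.

```python
def peroxidase_units_per_ml(dA_test, dA_blank, total_volume_ml,
                            dilution_factor, extinction_mM, enzyme_volume_ml):
    """Peroxidase activity in units per ml, following the formula in the text.

    dA_test, dA_blank : change in A405 per minute for the test and blank wells
    total_volume_ml   : total assay volume (0.15 ml for the 150 µl reactions)
    extinction_mM     : millimolar extinction coefficient of oxidized ABTS at
                        405 nm (not given in the text; supplied by the caller)
    enzyme_volume_ml  : volume of enzyme sample added to the well
    """
    if extinction_mM <= 0 or enzyme_volume_ml <= 0:
        raise ValueError("extinction coefficient and enzyme volume must be positive")
    numerator = (dA_test - dA_blank) * total_volume_ml * dilution_factor
    return numerator / (extinction_mM * enzyme_volume_ml)


# Hypothetical example values, chosen only to show the call.
activity = peroxidase_units_per_ml(
    dA_test=0.5, dA_blank=0.1, total_volume_ml=0.15,
    dilution_factor=1, extinction_mM=36.8, enzyme_volume_ml=0.01)
print(round(activity, 4))
```

Subtracting the heat-inactivated blank before scaling keeps non-enzymatic ABTS oxidation out of the reported activity.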

Tomato seedling infection assay. Surface-sterilized tomato seeds (cultivar 
Monica) were transferred to sterile glass tubes containing WA or WA supplemented 
with 2.5 x 10° microconidia per ml of the different F. oxysporum strains. Plants
were maintained in a growth chamber (15/9 h light/dark cycle, 28°C). Survival
was recorded daily, calculated by the Kaplan-Meier method and compared among 
groups using the log-rank test. Virulence experiments were conducted with 40 
plants per treatment and performed twice with similar results. 
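
As an illustration of the survival calculation named above, the following is a minimal Kaplan-Meier sketch in plain Python. It is not the software used in the study (which is not stated); the function name and the toy data are hypothetical, and the log-rank comparison between groups is not shown.

```python
from collections import Counter


def kaplan_meier(times, died):
    """Kaplan-Meier survival estimate.

    times : day on which each plant died or was last observed
    died  : True if the plant died on that day, False if it was still alive
            at its last observation (censored)
    Returns a list of (day, survival probability) at each day with a death.
    """
    deaths = Counter(t for t, d in zip(times, died) if d)
    leaving = Counter(times)          # plants leaving the risk set on each day
    at_risk = len(times)
    survival = 1.0
    curve = []
    for day in sorted(leaving):
        d = deaths.get(day, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((day, survival))
        at_risk -= leaving[day]
    return curve


# Toy data: six plants, two still alive when scoring stopped on day 32.
days = [5, 8, 8, 12, 32, 32]
dead = [True, True, True, True, False, False]
for day, s in kaplan_meier(days, dead):
    print(day, round(s, 3))
```

Plants that survive to the end of scoring are treated as censored, so they stay in the risk set for every earlier death but never lower the curve themselves.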

Bioinformatic and statistical analysis. F. oxysporum predicted proteins Ste11,
Ste7, Bck1, Mkk2, Mpk1 and Ste2 were identified by BLASTp search of the
Fusarium Comparative Database at the Broad Institute
(http://www.broadinstitute.org/annotation/genome/fusarium_group/MultiHome.html), using the
amino acid sequences of the S. cerevisiae proteins. Protein alignments and phylogenetic
comparisons were done using ClustalW (ref. 26) and MEGA5 (ref. 27).
Protein domain predictions were made using the Prosite database (ExPASy; Swiss
Institute of Bioinformatics). Prediction of Ste2 transmembrane helices was done
with SOSUI (ref. 28). Linear regression analysis was conducted using MedCalc v. 12.1.0
(MedCalc Software).

Data reporting. No statistical methods were used to predetermine sample size. The 
experiments were not randomized. The investigators were not blinded to allocation 
during experiments and outcome assessment. 


23. López-Berges, M. S., Rispail, N., Prados-Rosales, R. C. & Di Pietro, A.
A nitrogen response pathway regulates virulence functions in Fusarium
oxysporum via the protein kinase TOR and the bZIP protein MeaB. Plant Cell
22, 2459-2475 (2010).

24. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years
of image analysis. Nature Methods 9, 671-675 (2012).

25. Smith, A. T., Sanders, S. A., Thorneley, R. N., Burke, J. F. & Bray, R. R.
Characterisation of a haem active-site mutant of horseradish peroxidase,
Phe41→Val, with altered reactivity towards hydrogen peroxide and reducing
substrates. Eur. J. Biochem. 207, 507-519 (1992).

26. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23,
2947-2948 (2007).

27. Tamura, K. et al. MEGA5: molecular evolutionary genetics analysis using
maximum likelihood, evolutionary distance, and maximum parsimony
methods. Mol. Biol. Evol. 28, 2731-2739 (2011).

28. Hirokawa, T., Boon-Chieng, S. & Mitaku, S. SOSUI: classification and
secondary structure prediction system for membrane proteins. Bioinformatics
14, 378-379 (1998).

29. Rispail, N. & Di Pietro, A. Fusarium oxysporum Ste12 controls invasive growth
and virulence downstream of the Fmk1 MAPK cascade. Mol. Plant Microbe
Interact. 22, 830-839 (2009).




[Extended Data Figure 1, panels a-g; image content not recoverable. Surviving
labels: scoring line, solvent control and compound (a, b); distance from
application line (mm) (c); compound versus solvent (d, e); Cel, Gluc, Glu (e);
alignment of α-pheromone precursors from F. oxysporum, F. verticillioides and
F. graminearum (f); alignment of mature α-pheromone from F. oxysporum,
F. verticillioides, F. graminearum, M. oryzae, S. sclerotiorum, B. cinerea,
N. crassa, S. macrospora, A. nidulans, C. parasitica and S. cerevisiae (g).]

Extended Data Figure 1 | Plate assay for quantitative determination of
directed hyphal growth and identification of an F. oxysporum orthologue
of the S. cerevisiae α-pheromone precursor. a, Schematic representation
of the plate chemotropism assay. Test compound and solvent control are
applied to opposite sides of a Petri dish containing a layer of water agar
with 2.5 x 10° ml⁻¹ F. oxysporum microconidia, at a distance of 0.5 cm
from the central scoring line. Chemotropic index was calculated as
$(H_{\mathrm{test}} - H_{\mathrm{solv}})/H_{\mathrm{total}} \times 100$, where $H_{\mathrm{test}}$ is the number of hyphae growing
towards the test compound, $H_{\mathrm{solv}}$ is the number of hyphae growing towards
the solvent control, and $H_{\mathrm{total}}$ is the total number of hyphae counted.

b, Visualization of compound diffusion and gradient establishment. The 
dye Congo red (1% w/v in water) was loaded into the application well on 
the right side of the scoring line. Diffusion was recorded photographically 
after the indicated time intervals. c, Dye intensity in experiment b was 
measured at the indicated distances from the application well after different 
time intervals, using the Kodak Image Analyzer software. The blue dashed 
line represents the relative position of the scoring line. Mean values were 
calculated from measurements of five individual spots per distance. 

d, Direction of germ tube emergence after 2 h exposure to a gradient of Glu
or the solvent (H₂O) was quantitatively determined by lectin-FITC staining
and expressed as chemotropic index (versus H₂O, *P < 0.0001). Data are
presented as the mean from two experiments. n = 200 germ tube emergence
sites. e, Lengths of germ tubes exposed for 13 h to a gradient of 1% (w/v)
cellulose (Cel), 55 mM glucose (Gluc), 295 mM Glu or the solvent (H₂O)
were measured using the ImageJ software. The mean length of germ tubes
growing towards the nutrient chemoattractants is not significantly different
from that of germ tubes growing towards the solvent. Data are presented as
the mean from two experiments. n = 100 germ tubes. Error bars show s.d.

f, The predicted product of the F. oxysporum α-pheromone precursor gene
(Fusarium Comparative Database accession FOXG_08636) was aligned with
predicted α-pheromone precursors from F. graminearum (FGSG_05061) and
F. verticillioides (FVEG_06038). Conserved residues are indicated with an
asterisk. Predicted KR and RR cleavage signals for KEX2-like endopeptidases
are highlighted in red. Predicted maturation signals characterized by the
presence of XA or XP dipeptide repeats are highlighted in yellow. Predicted
mature α-pheromone decapeptide repeats are highlighted in grey. Coloured
arrowheads indicate differences between the decapeptide repeats at the
third amino acid residue. g, Amino acid alignment of predicted mature
α-pheromone of F. oxysporum with orthologues from ascomycete fungi.
Absolutely and highly conserved residues are shaded in black and in grey,
respectively. Residues replaced with alanines in the (Ala1,2) or (Ala6,7)
analogues (see Fig. 1c) are indicated with asterisks.
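
The chemotropic index defined in panel a of this legend reduces to a one-line calculation. The sketch below is illustrative only: the function name is hypothetical, and treating the total as the sum of the two counts when no separate total is supplied is an assumption, since the legend does not say whether hyphae with no clear direction were included in the total.

```python
def chemotropic_index(h_test, h_solv, h_total=None):
    """Chemotropic index in percent: (H_test - H_solv) / H_total x 100.

    h_test  : hyphae growing towards the test compound
    h_solv  : hyphae growing towards the solvent control
    h_total : total hyphae counted; if omitted, assumed to be h_test + h_solv
    """
    if h_total is None:
        h_total = h_test + h_solv   # assumption, see lead-in
    if h_total <= 0:
        raise ValueError("no hyphae counted")
    return 100 * (h_test - h_solv) / h_total


# Hypothetical counts: 300 of 500 germ tubes grow towards the compound.
print(chemotropic_index(300, 200))
```

An index of 0 corresponds to no directional preference, and positive values indicate net growth towards the test compound.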




[Extended Data Figure 2, panels a-i; image content not recoverable. Surviving
axis labels: chemotropic index (%); A280; time (min); peroxidase activity
(units per mg fresh weight); peroxidase activity (units/ml). Surviving sample
labels: no treat, PK, WF, EAF, <10, 10-30, 30-50, >50 (root exudate); F1-F5;
B1-B5; HRP, root exudate, SHAM, Asc.]


Extended Data Figure 2 | Purification of chemoattractant compounds
from tomato root exudate reveals secreted peroxidases. a, Chemotropic
growth of germ tubes towards a gradient of tomato root exudate (RE) either
untreated (no treat); treated with 1 mg ml⁻¹ proteinase K for 30 min at 37°C
(PK); extracted to obtain an ethyl acetate fraction (EAF) and a water fraction
(WF); or the WF subjected to centrifugal ultrafiltration with membranes of
10, 30 or 50 kDa molecular weight cut-off to obtain fractions <10, 10-30,
30-50 and >50, respectively (*P = 0.006; **P < 0.0001, versus untreated).
b, Anion exchange chromatography profile of fraction 30-50 from a. Obtained
fractions F1-F5 are indicated. c, Directed growth of F. oxysporum germ tubes
towards fractions F1-F5 from b (*P < 0.0001, versus H₂O). d, SDS-PAGE of
biologically active fraction F1 and inactive fraction F5, followed by staining
with Coomassie blue. Protein bands present in the active and absent from
the inactive fraction (named B1-B5) are indicated by arrowheads. Relative
positions of molecular weight markers are indicated on the right. e, Directed
growth of germ tubes towards the proteins eluted from bands B1-B5.

f, Peroxidase activity of root exudates obtained from 18 individual tomato
plants, indicated as units ml⁻¹ per mg fresh root weight. Data are presented
as the mean of three technical replicates. Error bars show s.d. g, Relationship
between peroxidase enzymatic activity of root exudates and elicited
chemotropic response. Each empty circle represents a root exudate sample
from an individual tomato plant (n = 18). Linear regression (solid line) and
95% mean prediction interval (dashed lines) indicate linear correlation of the
two variables (P < 0.001). h, Specific inhibitors and oxygen radical scavengers
abolish peroxidase enzymatic activity. Activity of 2.5 nM commercial HRP
or 100 µl root exudate was measured in the absence (C) or presence of 75 µM
of the specific inhibitors thiourea (TU) or SHAM, or 250 µM of the
scavenger (+)-sodium L-ascorbate (Asc), and indicated as units ml⁻¹. Data are
presented as the mean of three experiments, each with two technical replicates
(*P < 0.0002, versus C). i, Peroxidase inhibitors and scavengers do not affect
chemotropism towards glucose and α-pheromone. Chemotropic growth of
germ tubes towards a gradient of glucose or α-pheromone, in the absence (C)
or presence of 60 mM SHAM or 160 mM Asc. No significant differences were
observed between treated and untreated samples. a, c, e, g, i, Data represent
the mean from two experiments (a, c, g, i) or one representative experiment
performed twice (e). n = 500 germ tubes. Error bars show s.d.




a
Protein name  Protein band  Peptide sequence     Experimental mass (Da)  Theoretical mass (Da)
TMP1          2/3           EMVALAGAHTVGFAR      1545.63/1529.55         1528.78
TMP1          2/3           AVVDSAIDAETR         1246.51/1246.43         1245.62
CEVI-1        2/3           YASSQSQFFDDFASSMIK   2074.83/2058.67         2057.90
CEVI-1        2/3           LGNIGVLTGTNGEIR      1513.79                 1512.83
CEVI-1        2/3           DAASNVGAGGFDIVDDIK   1763.79                 1762.84
TMP2          2             EMVALAGAHTVGFAR      1545.60                 1528.78
TMP2          2             LGGQTYTVALGR         1235.40                 1234.67

[Extended Data Figure 3, panel b: amino acid sequence alignment of TMP1, TMP2,
CEVI-1 and HRP-C1A; alignment content not recoverable.]

Extended Data Figure 3 | Identification of chemoattractant proteins
from tomato root exudate. a, Peptide sequences obtained from protein
bands B2 and B3 after in-gel tryptic digestion followed by LC-ESI-MS/MS.
Masses were calculated by using monoisotopic masses of the occurring
amino acid residues and giving peptide masses as [MH]+. b, Amino acid
sequence alignment of class III tomato peroxidases TMP1 (P15003), TMP2
(P15004) and CEVI-1 (Q9LWA2), and HRP isoenzyme C (HRP_C1A)
(K7ZWW6). Peptides identified in the chemotropically active fraction of
tomato root exudate by LC-ESI-MS/MS are underlined in red. Predicted
signal peptides are indicated by green boxes. Residues conserved in at least
three of the four proteins are shaded in black. Conserved catalytic residues
are indicated by orange boxes. Residues Arg 38 and His 42, which were
replaced by Ser and Glu, respectively, in the catalytically inactive recombinant
TMP2(R38S,H42E) protein (see Fig. 2g, h) are marked with blue asterisks.
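
The mass calculation described for panel a can be sketched as a sum of monoisotopic residue masses plus one water. The code below is an illustration, not the facility's software; the function names are hypothetical, and the residue masses are standard monoisotopic values rounded to five decimals. Under these assumptions the neutral masses it returns agree with the theoretical values listed for AVVDSAIDAETR, LGGQTYTVALGR and EMVALAGAHTVGFAR in panel a; adding one proton gives the corresponding [MH]+ value.

```python
# Standard monoisotopic residue masses (Da), rounded to five decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # H2O added for the free N- and C-termini
PROTON = 1.00728     # added for the singly protonated ion [MH]+


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide (one-letter code)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


def mh_plus(peptide):
    """Mass of the singly protonated peptide, [MH]+."""
    return monoisotopic_mass(peptide) + PROTON


for pep in ("AVVDSAIDAETR", "LGGQTYTVALGR", "EMVALAGAHTVGFAR"):
    print(pep, round(monoisotopic_mass(pep), 2), round(mh_plus(pep), 2))
```

Modifications such as methionine oxidation (about +16 Da) are not modelled here.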




[Extended Data Figure 4, panels a-g; image content not recoverable. Surviving
labels: M, WT, P, T and transformant numbers on the PCR gels (a-d);
chemotropic index (%) for WT, ste7Δ, ste7Δ + ste7, ste11Δ and ste11Δ + ste11
with Gluc, α-pher and root exudate (f, g).]


Extended Data Figure 4 | Conserved elements of the invasive growth
MAPK cascade are required for chemotropism towards glucose.
a, b, Identification of ste7Δ (a) and ste11Δ (b) deletion mutants. Genomic
DNA of the wild-type strain (WT) and several independent transformants
was used as a template for polymerase chain reaction (PCR) with the primer
pairs ste7PF + Hyg-G (P) and ste7TR + Hyg-Y (T) and ste11PF + Hyg-G (P)
and ste11TR + Hyg-Y (T), respectively. Presence of an amplification
product is consistent with homologous replacement of the target gene.
c, d, Identification of complemented strains obtained from ste7Δ and ste11Δ
mutants. Genomic DNA of independent transformants obtained upon
transformation of the indicated mutants with the wild-type ste7 (c) or ste11
gene (d) was used as a template for PCR with primer pairs ste7PFN + ste7GR
and ste11PFN + ste11GR, respectively. Presence of an amplification product
is consistent with integration of an intact gene copy. e, Elements of the
Fmk1 MAPK pathway are required for invasive growth through cellophane
membranes. Colonies were grown on PDA plates covered with a cellophane
membrane for 2 days at 28°C (before). The cellophane with the fungal colony
was removed and plates were incubated for an additional day (after). The
experiment was performed twice, each with three plates. Results shown are
from one representative experiment. Scale bar, 2 cm. f, g, Directed growth
of germ tubes of the indicated F. oxysporum strains towards a gradient of
glucose (Gluc) (f), α-pheromone or tomato root exudate (g) (versus wild type
for a given compound, *P < 0.0001). f, g, Data are presented as the mean from
two experiments. n = 500 germ tubes. Error bars show s.d.




[Extended Data Figure 5, panels a-i; image content not recoverable. Surviving
labels: untreated, PD98059, SB202190, F. oxysporum α-pheromone (a); mpk1Δ and
fmk1Δ mpk1Δ transformants, 5.8 kb and 4.4 kb bands (b); M, WT, P, T and
transformant numbers on the PCR gels (c-f); YPD, Calcofluor white, Congo red
with WT, bck1Δ, bck1Δ + bck1, mkk2Δ, mkk2Δ + mkk2, mpk1Δ, mpk1Δ + mpk1 (g);
chemotropic index for Gluc, α-pher, root exudate and HRP (h, i).]


Extended Data Figure 5 | Conserved elements of the CWI MAPK cascade
are required for chemotropism towards α-pheromone, root exudate and
peroxidase. a, Directed growth of germ tubes of the wild type or the fmk1Δ
mutant towards a gradient of α-pheromone, in the absence or presence
of PD98059 (selective p42/44 (ERK-type) MAPK inhibitor) or SB202190
(selective p38/Hog1 MAPK inhibitor) (versus wild type, *P < 0.0001).
b, Identification of mpk1Δ and fmk1Δ mpk1Δ deletion mutants by
Southern blot analysis. Genomic DNA of the wild-type and 11 independent
transformants was treated with EcoRI, separated on a 0.7% agarose gel,
transferred to a nylon membrane and hybridized with a DNA probe
corresponding to the 3′ flanking region of the mpk1 gene. Transformants #1,
#4, #7 (wild-type background) and #1, #2, #4, #7 (fmk1Δ background) show a
banding pattern consistent with targeted deletion of the mpk1 gene.
c, d, Identification of mkk2Δ (c) and bck1Δ (d) deletion mutants. Genomic
DNA of independent transformants was used as template for PCR with
the primer pairs mkk2PF + Hyg-G (P) and mkk2TR + Hyg-Y (T), or
bck1PF + Hyg-G (P) and bck1TR + Hyg-Y (T), respectively. Presence of
an amplification product is consistent with homologous replacement of
the target gene. e, f, Identification of complemented strains obtained from
mkk2Δ and bck1Δ mutants. Genomic DNA of independent transformants
obtained after transformation of the indicated mutants with the wild-type
mkk2 (e) or bck1 allele (f) was used as a template for PCR with the primer
pairs mkk2PFN + mkk2GR, or bck1PFN + bck1GR, respectively. Presence
of an amplification product is consistent with integration of an intact gene
copy. g, Elements of the Mpk1 MAPK pathway are required for the cell
wall stress response. Colony phenotypes of the indicated strains grown
on yeast peptone dextrose medium (YPD) in the absence or presence of
the cell-wall-perturbing compounds Calcofluor white (20 µg ml⁻¹) or
Congo red (100 µg ml⁻¹). Plates were spot-inoculated with the indicated
amount of microconidia, incubated for 4 days at 28°C and scanned. The
experiment was performed twice, each with three plates. Results shown are
from one representative experiment. h, i, Directed growth of germ tubes of
the indicated F. oxysporum strains towards a gradient of glucose (Gluc) (h),
α-pheromone, tomato root exudate or HRP (i) (versus wild type for a given
compound, *P < 0.0001). a, h, i, Data are presented as the mean from two
experiments. n = 500 germ tubes. Error bars show s.d.




[Extended Data Figure 6, panels a, b; image content not recoverable. Surviving
labels: solvent and scoring line (a); cosine values from -0.10 to 0.25 for
H₂O, Glu and Pher in WT, fmk1Δ and mpk1Δ (b).]
Extended Data Figure 6 | Hyphal tip projection angle assay reveals
differential roles of Fmk1 and Mpk1 MAPKs in chemotropism
towards glutamate and α-pheromone. a, Schematic representation of the
chemotropism plate assay based on measurement of hyphal tip projection
angles. b, Average cosine of hyphal tip projection angles of the F. oxysporum
wild-type, fmk1Δ or mpk1Δ strains towards a gradient of Glu, α-pheromone
or the water control. Data are presented as the mean from three experiments.
n = 100 germ tubes. Bars indicate upper and lower 95% significance limits for
cosine means according to a t-test. A cosine of 1 means perfect orientation
while 0 means random orientation. Chemotropism was considered
significant when the lower confidence limit was >0.
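
The cosine statistic described in this legend can be sketched as follows. This is an illustration under stated assumptions: each angle is taken to be measured between the hyphal tip direction and the direction of the test compound, the function name and example angles are hypothetical, and a normal quantile is used in place of the t quantile (a close approximation at n = 100, but not the exact procedure named in the legend).

```python
import math
from statistics import NormalDist, mean, stdev


def mean_cosine_ci(angles_deg, level=0.95):
    """Mean cosine of projection angles with an approximate confidence interval.

    Returns (mean, lower limit, upper limit). A mean near 1 indicates growth
    aligned with the gradient; a mean near 0 indicates random orientation.
    """
    cosines = [math.cos(math.radians(a)) for a in angles_deg]
    m = mean(cosines)
    se = stdev(cosines) / math.sqrt(len(cosines))
    z = NormalDist().inv_cdf(0.5 + level / 2)   # normal approximation to t
    return m, m - z * se, m + z * se


# Hypothetical angles in degrees for a handful of germ tubes.
m, lower, upper = mean_cosine_ci([0, 60, 0, 60, 30, 45])
print(round(m, 3), round(lower, 3), round(upper, 3))
print("chemotropic" if lower > 0 else "not significant")
```

The decision rule in the last line mirrors the legend: the response counts as significant only when the lower limit lies above zero.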




[Extended Data Figure 7, panels a-d; image content not recoverable. Surviving
labels: phylogram taxa F. oxysporum, F. verticillioides, F. graminearum,
M. oryzae, N. crassa, S. macrospora, A. nidulans, S. sclerotiorum and
S. cerevisiae, with a distance scale from 0.8 to 0.0 (a); Southern blots of
ste2Δ (c) and fmk1Δ ste2Δ (d) transformants with 4.8 kb and 3.6 kb bands.]

Extended Data Figure 7 | Loss of Ste2 negatively affects virulence
of F. oxysporum on tomato plants. a, Phylogram of Ste2 orthologues
from ascomycete fungi. The analysis was conducted using the MEGA5
program. Distances were inferred using the unweighted pair group method
with arithmetic mean (UPGMA). b, Two-dimensional model of the
transmembrane topology of F. oxysporum Ste2. The model was generated
using the SOSUI software (ref. 28). Amino acid residues in the primary and
secondary transmembrane helix are indicated in dark and light green,
respectively. Hydrophobic, positively, and negatively charged residues are
marked in black, blue, and red, respectively. c, d, Southern blot analysis to
identify ste2Δ (c) and fmk1Δ ste2Δ (d) deletion mutants. Genomic DNA of
wild type and the indicated transformants was treated with EcoRI, separated
on a 0.7% agarose gel, transferred to a nylon membrane and hybridized
with a DNA probe corresponding to the 5′ flanking sequence of the ste2
gene. Transformants #1 and #8 in c and #4, #5 and #9 in d show a banding
pattern consistent with targeted deletion of the ste2 gene by homologous
integration of a single construct. e, Loss of Ste2 negatively affects virulence of
F. oxysporum on tomato seedlings. Surface-sterilized tomato seeds (cultivar
Monica) were germinated in glass tubes with 4 ml 0.5% water agar containing
2.5 x 10° ml⁻¹ microconidia of the indicated F. oxysporum strains and
incubated at 28°C under a daily cycle of 15 h light and 9 h dark. Plant survival
was recorded for 32 days (plot axes: survival against days after inoculation,
for WT, ste2Δ and ste2Δ + ste2). Plants inoculated with the ste2Δ mutant showed
significantly lower mortality than those inoculated with the wild-type and the
complemented strain (P = 0.02, log-rank test). n = 40 plants. Results shown are
from one representative experiment. Experiments were performed twice with
similar results.




[Extended Data Figure 8: schematic; image content not recoverable. Surviving
labels: chemoattractants (nutrients, α-pheromone, plant peroxidases) leading
to directed hyphal growth.]


Extended Data Figure 8 | Mechanism of chemotropic signalling in Fusarium oxysporum. Chemotropic sensing of nutrients such as glucose or Glu is
mediated by the Fmk1 MAPK pathway. Chemotropic sensing of α-pheromone and plant peroxidase-derived signals requires the 7TM-domain receptor Ste2
and the Mpk1 MAPK pathway. Dotted arrows between components denote unknown mechanistic links.




Extended Data Table 1 | Fusarium oxysporum strains used in this study 


Strain           Genotype                    Gene function                     Reference
FGSC 4287        wild type                                                     (ref. 16)
4287 GFP         PgpdA-GFP; HYG              Green Fluorescent Protein         (ref. 16)
fmk1Δ            fmk1::PHLEO                 MAPK                              (ref. 16)
fmk1Δ + fmk1     fmk1::PHLEO; fmk1::HYG      MAPK                              (ref. 16)
ste12Δ           ste12::HYG                  Homeodomain transcription factor  (ref. 29)
ste12Δ + ste12   ste12::HYG; ste12::PHLEO    Homeodomain transcription factor  (ref. 29)
rho1Δ            rho1::HYG                   Rho-type GTPase                   (ref. 19)
rho1Δ + rho1     rho1::HYG; rho1::PHLEO      Rho-type GTPase                   (ref. 19)
ste2Δ            ste2::HYG                   GPCR                              This study
ste2Δ + ste2     ste2::HYG; ste2::PHLEO      GPCR                              This study
fmk1Δ ste2Δ      fmk1::PHLEO; ste2::HYG      MAPK/GPCR                         This study
mpk1Δ            mpk1::HYG                   MAPK                              This study
mpk1Δ + mpk1     mpk1::HYG; mpk1::PHLEO      MAPK                              This study
fmk1Δ mpk1Δ      fmk1::PHLEO; mpk1::HYG      MAPK/MAPK                         This study
bck1Δ            bck1::HYG                   MAPKKK                            This study
bck1Δ + bck1     bck1::HYG; bck1::PHLEO      MAPKKK                            This study
mkk2Δ            mkk2::HYG                   MAPKK                             This study
mkk2Δ + mkk2     mkk2::HYG; mkk2::PHLEO      MAPKK                             This study
ste11Δ           ste11::HYG                  MAPKKK                            This study
ste11Δ + ste11   ste11::HYG; ste11::PHLEO    MAPKKK                            This study
ste7Δ            ste7::HYG                   MAPKK                             This study
ste7Δ + ste7     ste7::HYG; ste7::PHLEO      MAPKK                             This study

Ref. 29 is cited in this Table.




Extended Data Table 2 | Oligonucleotides used in this study 


Primer          Sequence                                                   Use
gpdA15B         CGAGACCTAATACAGCCCCT                                       hygB cassette
trpter8B        GGATCCAAACAAGTGTACCTGTGCATTC                               hygB cassette
Hyg-G           CGTTGCAAGACCTGCCTGAA                                       hygB cassette
Hyg-Y           GGATGCCTCCGCTCGAAGTA                                       hygB cassette
Ste2PFO         GCAGGCACAAAGAACAGCAAT                                      ste2 knockout/complementation
Ste2PFN         GTGGCAGAGGAGAGAGCTATAG                                     ste2 knockout/complementation
Ste2PFN2        ATTACACCAGCAGTGTTTGCC                                      ste2 knockout/complementation
Ste2PR          TAAAGATTGGAAGTGAAAGGGG                                     ste2 knockout/complementation
Ste2PRGPDA15B   TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)TAAAGATTGGAAGTGAAAGGGG*   ste2 knockout/complementation
Ste2TR          TCAACATCAACAAGCGAAAGAG                                     ste2 knockout/complementation
Ste2TRN         AACTTAGGGGCTCTGAGGATG                                      ste2 knockout/complementation
Ste2TFTrpter8B  TTTACCCAGAATGCACAGGTACACTTGTTT(A)GACCAAAACAAAACTTCTAGCG*   ste2 knockout/complementation
Ste2PFO2        ACCTGGATACACGAACGATAC                                      ste2 knockout/complementation
Mpk1TF1         TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)CAGTATTTCCCTTCAGCCAAC*    mpk1 knockout/complementation
Mpk1TR1         TCAAGACCAATGTACCTACGG                                      mpk1 knockout/complementation
Mpk1TR2         CTTTTGGACGAACTGTGAACC                                      mpk1 knockout/complementation
Mpk1PF1         TGGAGAAGAGTAAATGGACGG                                      mpk1 knockout/complementation
Mpk1PF2         AGGGAAACGAGGTAGGTTACA                                      mpk1 knockout/complementation
Mpk1PR1         TTTACCCAGAATGCACAGGTACACTTGTTT(A)CTGGTGATGTGGCTGATTTGT*    mpk1 knockout/complementation
Mpk1PFO         TCCACAGACTACAGAAGAACG                                      mpk1 knockout/complementation
Mpk1-R          TCTCCTAGAGGCATCCAGTCC                                      mpk1 knockout/complementation
STE11PF         TAGGTGATTAGACGTGGGAAG                                      ste11 knockout/complementation
STE11PFN        GGTCTAGGCTCACTTTGTTTC                                      ste11 knockout/complementation
STE11PR         TTTACCCAGAATGCACAGGTACACTTGTTT(A)CATTGTGGGCTGAGAAGGAAC*    ste11 knockout/complementation
STE11TF         TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)TTGACCACAACCTACGACCTA*    ste11 knockout/complementation
STE11TRN        CTTCTGATATGCCGATGGAAC                                      ste11 knockout/complementation
STE11TR         CATCAGTCCTTCTCAATCCAG                                      ste11 knockout/complementation
STE11GR         ATGTTCAGGGATTGTAGGAGC                                      ste11 knockout/complementation
STE7PF          CCTGAGCCGAGTATGGAATTG                                      ste7 knockout/complementation
STE7PFN         GTCCCCTTATGGCGAATGAAT                                      ste7 knockout/complementation
STE7PR          TTTACCCAGAATGCACAGGTACACTTGTTT(A)GATAGAATTGACAAGCTCGCC*    ste7 knockout/complementation
STE7TF          TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)TACCGTCTTCAAATCCCAAGG*    ste7 knockout/complementation
STE7TRN         CAGTGGCTTCGTTAATCAGTC                                      ste7 knockout/complementation
STE7TR          AATGGAAGAGAGTGGAAGAGG                                      ste7 knockout/complementation
STE7GR          TAAATGATCTTCAGGGTTAGGC                                     ste7 knockout/complementation
BCK1PF          GAGACTTTTGGAATGGAGAGG                                      bck1 knockout/complementation
BCK1PFN         GGTAGATTGAGTTACGTCTGG                                      bck1 knockout/complementation
BCK1PR          TTTACCCAGAATGCACAGGTACACTTGTTT(A)TCTTGAGGCTGAGATTGAGAC*    bck1 knockout/complementation
BCK1TF          TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)GTCTGGGTTGTGTAGTCCTG*     bck1 knockout/complementation
BCK1TRN         TGGTGATGTTCGTCAAGAGATA                                     bck1 knockout/complementation
BCK1TR          GTTTCCTTGTTGCCTCGATCT                                      bck1 knockout/complementation
BCK1GR          ATCCAGAATACCGAACCTTGC                                      bck1 knockout/complementation
MKK2PF          TAGCTTTGGATTGCGGTTGGA                                      mkk2 knockout/complementation
MKK2PFN         ACGAGAATGACGATGTGTGTG                                      mkk2 knockout/complementation
MKK2PR          TTTACCCAGAATGCACAGGTACACTTGTTT(A)GGGTAGTGGAGTTGAATCAGA*    mkk2 knockout/complementation
MKK2TF          TGGTCGTTGTAGGGGCTGTATTAGGTCTCG(A)TTTCTTTTGTCTGGGGTTGGG*    mkk2 knockout/complementation
MKK2TRN         CCGAGAATAGCATCTTCAGAC                                      mkk2 knockout/complementation
MKK2TR          CAGATTGCTCGTTTCCTCAAG                                      mkk2 knockout/complementation
MKK2GR          GCTCGTCTGTTGGGTTGTTTT                                      mkk2 knockout/complementation
Tap2_for1       CTCTGTTTCTTCCAAATAGAC                                      tmp2 cloning
Tap1/2_rev      CTCGAGTCACATAGAAGCCACAGAAG                                 tmp2 cloning
Tmp2FPCR_for    ATTAGTCTACATTTCGAGGACTGC                                   tmp2 mutagenesis
Tmp2FPCR_rev    GTCCTCGAAATGTAGACTAATGAG                                   tmp2 mutagenesis
Cevi1_for1      CATATGCAATTAAGTGCAACATTTTACG                               cevi-1 cloning
Cevi1_rev       CTCGAGCTAATCAACTAATTAACCCTCT                               cevi-1 cloning

*The sequence shown in italics corresponds to the complementary region of the gpdA15B (Ste2PRGPDA15B, Mpk1TF1, STE11TF, STE7TF, BCK1TF and MKK2TF) or trpter8B (Ste2TFTrpter8B, Mpk1PR1,
STE11PR, STE7PR, BCK1PR and MKK2PR) primers.
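
The footnote's statement that the fusion primers carry a region complementary to gpdA15B or trpter8B can be checked directly from the listed sequences. The sketch below is illustrative; the function and variable names are hypothetical, and the two 30-nucleotide tails are copied from the fusion primers in the table (the part before the "(A)").

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


GPDA15B = "CGAGACCTAATACAGCCCCT"
TRPTER8B = "GGATCCAAACAAGTGTACCTGTGCATTC"

# 5' tails shared by the fusion primers in the table.
GPDA_TAIL = "TGGTCGTTGTAGGGGCTGTATTAGGTCTCG"      # e.g. Ste2PRGPDA15B, Mpk1TF1
TRPTER_TAIL = "TTTACCCAGAATGCACAGGTACACTTGTTT"    # e.g. Ste2TFTrpter8B, Mpk1PR1

print(reverse_complement(GPDA15B))
print(GPDA_TAIL.endswith(reverse_complement(GPDA15B)))
print(TRPTER_TAIL.endswith(reverse_complement(TRPTER8B)[:22]))
```

The gpdA tail ends with the full reverse complement of gpdA15B, while the trpter tail ends with the first 22 nucleotides of the reverse complement of trpter8B.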


LETTER 


doi:10.1038/nature16064 


Epithelial-to-mesenchymal transition 
is dispensable for metastasis but induces 
chemoresistance in pancreatic cancer 


Xiaofeng Zheng, Julienne L. Carstens, Jiha Kim, Matthew Scheible, Judith Kaye, Hikaru Sugimoto, Chia-Chin Wu,
Valerie S. LeBleu & Raghu Kalluri


Diagnosis of pancreatic ductal adenocarcinoma (PDAC) is 
associated with a dismal prognosis despite current best therapies; 
therefore new treatment strategies are urgently required. Numerous 
studies have suggested that epithelial-to-mesenchymal transition 
(EMT) contributes to early-stage dissemination of cancer cells and 
is pivotal for invasion and metastasis of PDAC’*. EMT is associated 
with phenotypic conversion of epithelial cells into mesenchymal- 
like cells in cell culture conditions, although such defined 
mesenchymal conversion (with spindle-shaped morphology) of 
epithelial cells in vivo is rare, with quasi-mesenchymal phenotypes 
occasionally observed in the tumour (partial EMT)*°. Most studies 
exploring the functional role of EMT in tumours have depended 
on cell-culture-induced loss-of-function and gain-of-function 
experiments involving EMT-inducing transcription factors such 
as Twist, Snail and Zeb1 (refs 2,3,7-10). Therefore, the functional 
contribution of EMT to invasion and metastasis remains unclear*®, 
and genetically engineered mouse models to address a causal 
connection are lacking. Here we functionally probe the role of 
EMT in PDAC by generating mouse models of PDAC with deletion 
of Snail or Twist, two key transcription factors responsible for 
EMT. EMT suppression in the primary tumour does not alter the 
emergence of invasive PDAC, systemic dissemination or metastasis. 
Suppression of EMT leads to an increase in cancer cell proliferation 
with enhanced expression of nucleoside transporters in tumours, 
contributing to enhanced sensitivity to gemcitabine treatment 
and increased overall survival of mice. Collectively, our study 
suggests that Snail- or Twist-induced EMT is not rate-limiting 
for invasion and metastasis, but highlights the importance of 
combining EMT inhibition with chemotherapy for the treatment 
of pancreatic cancer. 

We crossed Twist1loxP/loxP (Twist1L/L) or Snai1loxP/loxP (Snai1L/L)
mice with Pdx1-cre;LSL-KrasG12D;P53R172H/+ (KPC) to generate
the Pdx1-cre;LSL-KrasG12D;P53R172H/+;Twist1L/L (KPC;TwistcKO)
and the Pdx1-cre;LSL-KrasG12D;P53R172H/+;Snai1L/L (KPC;SnailcKO)
mice, respectively. The resultant progeny were born in an expected 
Mendelian ratio, without overt phenotypic findings other than the 
anticipated emergence of spontaneous pancreatic cancer (Extended 
Data Fig. 1a). Genetic deletion of Snail or Twist1 did not significantly 
delay pancreatic tumorigenesis, alter tumour histopathology features 
or local invasion (Fig. 1a–c and Extended Data Table 1). KPC;TwistcKO
and KPC;SnailcKO mice displayed similar tumour burden compared
to KPC control mice (Extended Data Fig. 1b) and no significant
differences in overall survival (Fig. 1d). Loss of Twist1 or Snail expression
in the pancreas epithelium was confirmed by in situ hybridization
coupled with CK8 epithelial immunolabelling (Fig. 1e and Extended
Data Fig. 1c) as well as immunolabelling for Twist and Snail (Extended
Data Fig. 1d). Significant suppression of EMT was noted (Fig. 1f, g
and Extended Data Fig. 1e, f). Lineage tracing (Fig. 1f and Extended
Data Fig. 1e) and immunolabelling of the primary tumour (Fig. 1g)
showed a significant decrease in the frequency of epithelial cells with
expression of the mesenchymal marker αSMA (EMT+ cells) and
a decrease in expression of the EMT-inducing transcription factor
Zeb1 (Fig. 1h). Global gene expression profiling of tumours revealed a
decrease in expression of EMT-associated genes (including Snail and
Twist1) in KPC;SnailcKO and KPC;TwistcKO mice compared to KPC
control (Extended Data Fig. 1f). Loss of Snail and Twist enhanced
E-cadherin expression and suppressed Zeb2 and Sox4 expression
in cancer cells (Extended Data Fig. 2a–c). Snai2 (Slug) expression
was restricted to early pancreatic intraepithelial neoplasia (PanIN)
lesions in all the experimental groups with no observed expression
in advanced tumours and was significantly reduced in KPC;SnailcKO
and KPC;TwistcKO mice compared to KPC control mice (Extended
Data Fig. 2d).

While desmoplasia, including extracellular matrix (ECM) and 
myofibroblast content (Fig. 1i and Extended Data Fig. 2e, f), tumour
vessel density (Extended Data Fig. 2g), intratumoural hypoxia
(Extended Data Fig. 2h), CD3+ T-cell infiltration (Extended Data
Fig. 2i) and cancer cell apoptosis (Fig. 2a) were unaffected by Twist/Snail
deletion in KPC tumours, the proliferation of cancer cells in mice
with suppressed EMT was significantly increased (Fig. 2b), as shown
previously in mouse models of breast cancers!!~!*. Immunostaining 
experiments further revealed that EMT+ cancer cells are largely
Ki67− (Extended Data Fig. 3a). Altogether, these data suggest that
EMT driven by Twist/Snail transcription factors is dispensable for 
initiation and progression of primary pancreatic cancer. 

Next, we investigated whether suppression of EMT impacts 
invasion and metastasis. The number of YFP+ circulating tumour
cells from lineage-traced KPC and KPC;TwistcKO mice was found to be
unchanged (Fig. 2c and Extended Data Fig. 3b), and expression
of cancer-cell-specific KrasG12D mRNA in the blood from KPC,
KPC;TwistcKO and KPC;SnailcKO mice was unaffected (Fig. 2d),
suggesting that suppression of EMT in pancreatic tumours does not
impact the rate of systemic dissemination of cancer cells. Extensive
histopathological analyses, coupled with CK19 or YFP immunostaining of
distant metastatic target organs, namely the liver, lung and spleen,
indicated a similar frequency of metastasis in EMT-suppressed tumours
when compared to control tumours (Fig. 2e, Extended Data Fig. 3c
and Extended Data Tables 1 and 2). The metastases were negative for
Twist, Snail, Zeb1 and αSMA, with the exception of a few KPC
metastatic cells that expressed αSMA or Zeb1 (Extended Data Fig. 3d–f),


1Department of Cancer Biology, Metastasis Research Center, University of Texas MD Anderson Cancer Center, Houston, Texas 77054, USA. 2Department of Genomic Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas 77054, USA. 3Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030, USA. 4Department of Bioengineering, Rice University, Houston, Texas 77030, USA.
*These authors contributed equally to this work. 


26 NOVEMBER 2015 | VOL 527 | NATURE | 525 


© 2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 1 graphic not reproduced. Recoverable labels: histology legend (necrosis, sarcomatoid, poor, moderate, well, PanIN, normal); curves for KPC, KPC;TwistcKO and KPC;SnailcKO plotted against days; panel f stain YFP/αSMA/DAPI.]

Figure 1 | EMT inhibition does not alter primary tumour progression.
a, Representative haematoxylin and eosin (H&E)-stained primary tumours
(scale bar, 100 µm). b, Relative percentages of each primary tumour
histological tissue phenotype. n = 31 (KPC), 14 (KPC;TwistcKO) and
30 (KPC;SnailcKO) mice; error bars represent s.d. c, Local invasiveness.
n = 31 (KPC), 14 (KPC;TwistcKO) and 30 (KPC;SnailcKO) mice; error bars
represent s.d. d, Overall survival. n = 29 (KPC), 12 (KPC;TwistcKO) and
33 (KPC;SnailcKO) mice. e, Twist1 or Snail in situ hybridization (black) with
CK8 (red) immunolabelling in primary tumours (n = 3 mice for all groups;
scale bar, 50 µm). Relative percentages of Twist1+CK8+ or Snail+CK8+
double-positive cells are shown below (two-tailed t-test).


while being positive for E-cadherin and Ki-67 (Extended Data Fig. 3g, h).
The proliferation rate of cancer cells in the metastases was similar
in KPC, KPC;SnailcKO and KPC;TwistcKO mice (Extended Data




f, αSMA immunolabelling in YFP lineage-traced primary tumours (n = 3
mice for both groups; scale bar, 50 µm; two-tailed t-test). g, αSMA (red), CK8
(green) and DAPI (blue) immunolabelling in primary tumours; white arrows
indicate double-positive cells (n = 4 mice for all groups; scale bar, 20 µm). h,
Zeb1 immunolabelling (n = 5 (KPC), 6 (KPC;TwistcKO) and 6 (KPC;SnailcKO)
mice; scale bar, 50 µm; inset scale bar, 20 µm). i, Masson's trichrome stain
(MTS) (n = 8 (KPC), 7 (KPC;TwistcKO) and 7 (KPC;SnailcKO) mice; scale
bar, 100 µm; error bars represent s.d.). Unless otherwise indicated error bars
represent s.e.m., percentages represent per cent change from control and
significance was determined by one-way ANOVA. *P < 0.05, **P < 0.01,
***P < 0.001, ****P < 0.0001; NS, not significant.


Fig. 3h). Collectively, the results indicated that the deletion of Twist1 
or Snail in genetically engineered mouse models of PDAC did not 
reduce metastatic disease. 
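The "no significant differences" call on the Fig. 2e metastasis counts can be checked by hand. The following is a minimal sketch, assuming a plain Pearson χ² test on the "any metastasis" row (17/31, 8/14 and 18/30 mice); the source only states "χ² analysis", so the exact variant the authors used is an assumption, and the function name is illustrative.

```python
import math

# Counts transcribed from the Fig. 2e table: (mice with any metastasis, mice scored).
any_metastasis = {
    "KPC": (17, 31),
    "KPC;TwistcKO": (8, 14),
    "KPC;SnailcKO": (18, 30),
}


def chi_square_2xk(groups):
    """Pearson chi-square statistic for a 2 x k table of positive/negative counts."""
    positives = [pos for pos, _ in groups.values()]
    negatives = [total - pos for pos, total in groups.values()]
    grand_total = sum(positives) + sum(negatives)
    statistic = 0.0
    for row in (positives, negatives):
        row_total = sum(row)
        for observed, pos, neg in zip(row, positives, negatives):
            # Expected count = row total x column total / grand total.
            expected = row_total * (pos + neg) / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = len(positives) - 1
    return statistic, dof


statistic, dof = chi_square_2xk(any_metastasis)

# With exactly 2 degrees of freedom the chi-square tail probability is exp(-x / 2).
assert dof == 2
p_value = math.exp(-statistic / 2)
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.3f}")
```

Under these assumptions the P value is far above 0.05, in line with the reported absence of a difference in metastatic frequency between genotypes.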


[Figure 2 graphic, panels a, b and e, not reproduced. Recoverable labels: H&E and CK19 staining of metastatic liver nodules.]

Fig. 2e table: number of positive tissues / total tissues examined
                   KPC      KPC;TwistcKO   KPC;SnailcKO
Liver metastasis   11/31    6/14           13/30
Lung metastasis    11/31    4/14           9/30
Spleen invasion    2/30     2/14           5/29
Any metastasis     17/31    8/14           18/30
No significant differences
Figure 2 | EMT inhibition does not alter invasion and metastasis.
a, b, Primary tumour immunolabelling for cleaved caspase-3 (a; n = 6 mice
for all groups; scale bar, 50 µm) and Ki67 (b; n = 7 (KPC), 7 (KPC;TwistcKO)
and 9 (KPC;SnailcKO) mice; scale bar, 100 µm). c, Percentage of YFP+
circulating tumour cells (CTCs) (n = 8 mice for both groups; two-tailed
t-test; error bars represent s.d.). d, KrasG12D expression in whole blood
cell pellets (n = 5 (KPC), 3 (KPC;TwistcKO) and 5 (KPC;SnailcKO) mice;
error bars represent s.d.). e, Haematoxylin and eosin staining and CK19
immunolabelling of metastatic liver nodules. Metastatic tumour nodules (T)
outlined by a dotted line (scale bar, 100 µm). A table presenting the number
of positive tissues out of total tissues examined is shown below (χ² analysis).
f, Expression analysis of Twist1 and Snail in cultured primary tumour
cell lines (n = 4 (KPC) and 5 (KPC;TwistcKO) individual cell lines (Twist1)


To evaluate whether cancer cells from the pancreas with and with- 
out EMT program differentially benefited from impaired prolifera- 
tion to form secondary tumours, we isolated cancer cells from KPC, 
KPC;TwistcKO and KPC;SnailcKO mice to assay their organ colonization
potential. Twist1 was significantly reduced and Snail expression was
undetectable in cancer cells isolated from Twist- and Snail-deleted
tumours, respectively (Fig. 2f). Short-term potential to form tumour
spheres (associated with putative cancer stem phenotype) appeared
similar in TwistcKO and SnailcKO KPC cells when compared to control
KPC cells (Fig. 2g)>*!4"!®, Lung colonization frequencies following 
i.v. injection of KPC cancer cells (Twist- or Snail-deleted) were similar
to the control KPC cancer cells (Fig. 2h). These results suggest that a 
favoured epithelial phenotype of cancer cells (via suppression of EMT) 
did not impact the capacity to form tumour spheres or their ability for 
organ colonization”. 

Cancer cell EMT is associated with gemcitabine drug resistance in 
PDAC patients and in the orthotopic mouse models of PDAC1?*!8-73, 
Moreover, enhanced frequency of EMT+ cancer cells in pancreatic
tumours is associated with poor survival?*”°. To determine whether 
EMT suppression enhances PDAC sensitivity to gemcitabine chemo- 
therapy, we tested the gemcitabine sensitivity of cancer cells with 


[Figure 2 graphic, panels c, d and f–h, not reproduced. Recoverable labels: bright-field and YFP sphere images for KPC, KPC;TwistcKO and KPC;SnailcKO cell lines.]

Fig. 2h table: number of colonized tissues / total tissues examined, per injected cell line
                      KPC               KPC;TwistcKO      KPC;SnailcKO
                      line 1   line 2   line 1   line 2   line 1   line 2   line 3
Lung colonization     3/5      4/5      9/11     4/4      4/4      5/5      5/5
Liver colonization    1/5      0/5      2/11     0/4      0/4      1/5      1/5
Spleen colonization   0/5      0/5      0/11     0/4      0/4      0/6      0/5
No significant differences
or 4 (KPC) and 6 (KPC;SnailcKO) individual cell lines (Snai1); one-tailed
t-test of ΔCt, error bars represent s.d.). g, Bright-field or YFP images and
quantification of sphere number in cultured tumour cell lines (n = 3 (KPC),
2 (KPC;TwistcKO) and 3 (KPC;SnailcKO) individual cell lines; scale bar,
50 µm). h, Haematoxylin and eosin images (scale bar, 100 µm) of colonized
lungs from intravenously injected cultured primary tumour cell lines
KPC (n = 5 (cell line 1) and 5 (cell line 2) mice injected) and KPC;TwistcKO
(n = 11 (cell line 1) and 4 (cell line 2) mice injected) and KPC;SnailcKO
(n = 4 (cell line 1), 5 (cell line 2) and 5 (cell line 3) mice injected). A table
presenting the number of colonized tissues out of total tissues examined is
shown below (χ² analysis). Unless otherwise indicated error bars represent
s.e.m. and significance was determined by one-way ANOVA. *P < 0.05,
**P < 0.01, ****P < 0.0001; NS, not significant; ND, not detected.


suppressed EMT in KPC mice. Equilibrative nucleoside transporter
(ENT1) and concentrating nucleoside transporter (Cnt3) were significantly
upregulated in cancer cells lacking Snail and Twist, while
ENT2 expression was unchanged (Fig. 3a–c). KPC, KPC;SnailcKO
and KPC;TwistcKO mice were treated with gemcitabine and tumour
burden was monitored by MRI (Extended Data Table 3). Tumour
progression was suppressed in KPC;SnailcKO and KPC;TwistcKO mice
when compared to treated KPC control mice (Fig. 3d). KPC;SnailcKO
and KPC;TwistcKO mice treated with gemcitabine showed improved
histopathology and increased survival (Fig. 3e–g).
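The survival percentages in the Fig. 3e table follow directly from the group sizes. This is a minimal sketch of that arithmetic, assuming "per cent survival at end point" means mice alive at day 21 divided by mice treated. The KPC;SnailcKO end-point count is taken as 11, which is what the reported n = 20 and 55.0% imply (the figure 3g legend also lists 11 end-point mice); the variable names are illustrative.

```python
# (died before end point, alive at end point) per gemcitabine-treated group.
groups = {
    "KPC + gem.": (10, 3),
    "KPC;TwistcKO + gem.": (6, 9),
    "KPC;SnailcKO + gem.": (9, 11),  # 11 inferred from n = 20 and 55.0%
}


def percent_survival(died: int, alive: int) -> float:
    """Share of treated mice still alive at the end point, in per cent."""
    return 100 * alive / (died + alive)


survival = {
    name: round(percent_survival(died, alive), 1)
    for name, (died, alive) in groups.items()
}

for name, value in survival.items():
    print(f"{name}: {value}%")
```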

Cancer cells isolated from the tumours of KPC;SnailcKO and
KPC;TwistcKO mice showed epithelial morphology (Extended
Data Fig. 4a) and reduced expression of mesenchymal genes
compared to KPC cancer cell lines (Extended Data Fig. 4b).
However, in tissue culture conditions (2D culture on plastic),
equilibrative nucleoside transporters (ENT1/ENT2/ENT3)
showed similar expression patterns (Extended Data
Fig. 4b) and expression of concentrating nucleoside transporters
(Cnt1/Cnt3) was not detected (data not shown). Increased proliferation
of KPC;SnailcKO and KPC;TwistcKO cancer cells compared to
KPC control cells (Extended Data Fig. 4c) probably accounted for


[Figure 3 graphic not reproduced. Recoverable labels: panels a–c, ENT1, ENT2 and Cnt3 intensity scores per visual field (ND, not detected); panel d, tumour volumes on day 0 and day 19; panel e, survival against days from the start of gemcitabine; panel f, H&E.]

Fig. 3e table:
                       Died   End point   Per cent survival at end point
KPC + gem.             10     3           23.1%
KPC;TwistcKO + gem.    6      9           60.0%
KPC;SnailcKO + gem.    9      11          55.0%


Figure 3 | EMT inhibition sensitizes tumours to gemcitabine in KPC
mice. a–c, Primary tumour immunolabelling for ENT1 (a), ENT2 (b)
and Cnt3 (c) (n = 6 (KPC), 5 (KPC;TwistcKO) and 4 (KPC;SnailcKO) mice;
scale bar, 100 µm; error bars represent s.e.m., two-tailed t-test). d, MRI
tumour volumes of KPC plus gemcitabine (+ gem.) (n = 13 mice, 10 died
before day 19), KPC;TwistcKO + gem. (n = 15 mice, 5 died before day 19)
and KPC;SnailcKO + gem. (n = 20 mice, 9 died before day 19). One-way


the increased sensitivity to gemcitabine and erlotinib in this setting 
(Extended Data Fig. 4d). 

Next, we crossed Snai1L/L mice to the PDAC mouse model Ptf1a
(P48)-cre;LSL-KrasG12D;Tgfbr2L/L (KTC) to generate Ptf1a (P48)-
cre;LSL-KrasG12D;Tgfbr2L/L;Snai1L/L (KTC;SnailcKO) mice. The KTC model
offers a reliable and penetrant disease progression rate with a consistent
timeline of death due to PDAC. Similar to the KPC;SnailcKO mice,
KTC;SnailcKO mice exhibited suppression of EMT but no change in
primary tumour histopathology, lifespan, local invasion, desmoplasia
or frequency of apoptosis (Fig. 4f and Extended Data Figs 5a–e and
6a). KTC;SnailcKO mice presented with significantly reduced Zeb1
expression in cancer cells but enhanced expression of Cnt3, ENT2 and
proliferation (Extended Data Fig. 5e). ENT1 expression was unchanged
in KTC;SnailcKO mice compared to KTC mice (Extended Data Fig. 6a).
KTC;SnailcKO mice demonstrated enhanced response to gemcitabine
therapy, with significant normal parenchymal area and reduced tumour
tissue (Fig. 4a–c). Gemcitabine therapy in KTC;SnailcKO mice reduced
tumour burden (Fig. 4d) and significantly improved overall survival
(Fig. 4e) when compared to gemcitabine-treated control KTC mice.
Gemcitabine therapy specifically increased cancer cell apoptosis and
removed enhanced proliferation observed in EMT-suppressed tumours
(Fig. 4g and Extended Data Fig. 5e), without impacting the desmoplastic
reaction (Extended Data Fig. 6b). Overall, these results suggest




ANOVA comparing mean tumour volumes on day 0 and day 19, error
bars represent s.d. e, Survival on gemcitabine treatment to end point
(day 21). f, Haematoxylin and eosin-stained primary tumours (scale bar,
100 µm). g, Relative percentages of each histological tissue phenotype of
end-point mice (n = 3 (KPC + gem.), 9 (KPC;TwistcKO + gem.) and
11 (KPC;SnailcKO + gem.) mice; error bars represent s.d.; two-tailed
t-test). *P < 0.05, **P < 0.01; NS, not significant.


an enhanced sensitivity of EMT-suppressed cancer cells to gemcitabine. 
Both ENT2 and Cnt3 were upregulated in EMT-suppressed tumours 
(Fig. 4g). These data support a possible mechanistic connection 
between EMT and resistance to chemotherapy in PDAC. 
Collectively, our studies provide a comprehensive functional anal- 
ysis of EMT in PDAC progression and metastasis. Absence of either 
Twist1 or Snail did not alter cancer progression or the capacity for 
local invasion or metastasis to lung and liver in genetically engineered 
mouse models of PDAC. Metastasis occurs despite a significant loss 
of EMT with either the deletion of Snail or Twist, and in both set- 
tings, Zeb1, Sox4, Slug and Zeb2 are also significantly suppressed. 
Nevertheless, it is possible that other EMT-inducing factors may com- 
pensate for the loss of Snail or Twist to induce invasion and metastasis. 
While Pdx1 is expressed during the development of the pancreas (in 
early pancreatic buds and all three major lineages of the pancreas: 
ductal, acinar and β-islets), its expression is largely repressed in the
adult exocrine pancreas*®’’. Therefore, deletion of Snail or Twist 
occurs at the embryonic stage and mice are born normal and exhibit 
normal pancreas histology before the onset of cancer. The mice with 
Snail or Twist deletion develop PanIN lesions at the same frequency 
as the control mice. One could argue that suppression of EMT start- 
ing from the inception of cancer could have launched compensatory 
mechanisms to overcome EMT-dependent invasion and metastasis. 


[Figure 4 graphic not reproduced. Recoverable labels: histology legend (necrosis, poor, moderate, well, PanIN, normal); curves against days for KTC + gem. versus KTC;SnailcKO + gem. and for KTC versus KTC;SnailcKO.]


Figure 4 | EMT inhibition sensitizes tumours to gemcitabine in KTC
mice. a, b, Haematoxylin and eosin-stained primary tumour (scale bar,
100 µm) and relative percentage of each histological tissue phenotype
in KTC + gem. (n = 5) and KTC;SnailcKO + gem. (n = 7) mice (error
bars represent s.d.). c, Local invasiveness (n = 5 (KTC + gem.) and
7 (KTC;SnailcKO + gem.) mice; error bars represent s.d.). d, Pancreatic
mass (n = 3 (KTC + gem.) and 4 (KTC;SnailcKO + gem.) mice; error
bars represent s.d.). e, Overall survival of KTC + gem. (n = 8) and
KTC;SnailcKO + gem. (n = 4) mice. f, Overall survival of KTC (n = 6) and
KTC;SnailcKO (n = 3) mice. g, αSMA (red), CK8 (green) and DAPI (blue)


However, such compensation is not observed with respect to chemore- 
sistance, and previous studies have demonstrated that EMT and cancer 
cell dissemination are observed even before PDAC lesions are detected 
in KPC mice’. 

Our study demonstrates that EMT results in suppression of 
cancer cell proliferation and suppression of drug transporter and concentrating
proteins, therefore inadvertently protecting EMT+ cells
from anti-proliferative drugs such as gemcitabine. The correlation of 
decreased survival of pancreatic cancer patients with increased EMT is 
probably due to their impaired capacity to respond to gemcitabine and 
chemotherapeutics, which is a standard of care for most patients”*”?. 
A compromised response to chemotherapy probably also explains 
higher metastatic disease in association with decreased survival of 




staining of primary tumours; white arrows indicate double-positive cells
(n = 4 mice for both groups; scale bar, 20 µm), and immunolabelling for
Zeb1 (n = 4 (KTC + gem.) and 5 (KTC;SnailcKO + gem.) mice; scale bar,
50 µm; inset scale bar, 20 µm), cleaved caspase-3 (n = 4 (KTC + gem.)
and 5 (KTC;SnailcKO + gem.) mice; scale bar, 50 µm), Ki67 (n = 4 (KTC +
gem.) and 5 (KTC;SnailcKO + gem.) mice; scale bar, 100 µm), ENT2 (n = 5
mice for both groups; scale bar, 100 µm), and Cnt3 (n = 5 mice for both
groups; scale bar, 100 µm). Unless otherwise indicated error bars represent
s.e.m. and significance was determined by two-tailed t-tests. *P < 0.05,
**P < 0.01, ***P < 0.001; NS, not significant.


patients with enhanced EMT signatures. Collectively, our study offers 
the opportunity to evaluate the potential of targeting EMT to enhance 
efficacy of chemotherapy and targeted therapies*”. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 10 June; accepted 8 October 2015. 
Published online 11 November 2015. 


1. Hotz, B. et al. Epithelial to mesenchymal transition: expression of the
regulators snail, slug, and twist in pancreatic cancer. Clin. Cancer Res. 13,
4769-4776 (2007).
2. Arumugam, T. et al. Epithelial to mesenchymal transition contributes to drug
resistance in pancreatic cancer. Cancer Res. 69, 5820-5828 (2009).
3. Taube, J. H. et al. Core epithelial-to-mesenchymal transition interactome
gene-expression signature is associated with claudin-low and metaplastic
breast cancer subtypes. Proc. Natl Acad. Sci. USA 107, 15449-15454 (2010).
4. Rhim, A. D. et al. EMT and dissemination precede pancreatic tumor formation.
Cell 148, 349-361 (2012).
5. Kalluri, R. & Weinberg, R. A. The basics of epithelial-mesenchymal transition.
J. Clin. Invest. 119, 1420-1428 (2009).
6. McDonald, O. G., Maitra, A. & Hruban, R. H. Human correlates of provocative
questions in pancreatic pathology. Adv. Anat. Pathol. 19, 351-362 (2012).
7. Guaita, S. et al. Snail induction of epithelial to mesenchymal transition in
tumor cells is accompanied by MUC1 repression and ZEB1 expression. J. Biol.
Chem. 277, 39209-39216 (2002).
8. Wellner, U. et al. The EMT-activator ZEB1 promotes tumorigenicity by repressing
stemness-inhibiting microRNAs. Nature Cell Biol. 11, 1487-1495 (2009).
9. Zhang, K. et al. Knockdown of snail sensitizes pancreatic cancer cells to
chemotherapeutic agents and irradiation. Int. J. Mol. Sci. 11, 4891-4904 (2010).
10. Tsai, J. H., Donaher, J. L., Murphy, D. A., Chau, S. & Yang, J. Spatiotemporal
regulation of epithelial-mesenchymal transition is essential for squamous cell
carcinoma metastasis. Cancer Cell 22, 725-736 (2012).
11. Stockinger, A., Eger, A., Wolf, J., Beug, H. & Foisner, R. E-cadherin regulates cell
growth by modulating proliferation-dependent β-catenin transcriptional
activity. J. Cell Biol. 154, 1185-1196 (2001).
12. Muraoka-Cook, R. S., Dumont, N. & Arteaga, C. L. Dual role of transforming
growth factor beta in mammary tumorigenesis and metastatic progression.
Clin. Cancer Res. 11, 937s-943s (2005).
13. Hugo, H. J. et al. Direct repression of MYB by ZEB1 suppresses proliferation
and epithelial gene expression during epithelial-to-mesenchymal transition of
breast cancer cells. Breast Cancer Res. 15, R113 (2013).
14. Mani, S. A. et al. The epithelial-mesenchymal transition generates cells with
properties of stem cells. Cell 133, 704-715 (2008).
15. Liu, H. et al. Cancer stem cells from human breast tumors are involved in
spontaneous metastases in orthotopic mouse models. Proc. Natl Acad. Sci. USA
107, 18115-18120 (2010).
16. Wang, Z. et al. Activated K-Ras and INK4a/Arf deficiency promote
aggressiveness of pancreatic cancer by induction of EMT consistent with
cancer stem cell phenotype. J. Cell. Physiol. 228, 556-562 (2013).
17. Yang, J. et al. Twist, a master regulator of morphogenesis, plays an essential
role in tumor metastasis. Cell 117, 927-939 (2004).
18. Vega, S. et al. Snail blocks the cell cycle and confers resistance to cell death.
Genes Dev. 18, 1131-1143 (2004).
19. Shah, A. N. et al. Development and characterization of gemcitabine-resistant
pancreatic tumor cells. Ann. Surg. Oncol. 14, 3629-3637 (2007).
20. Yin, T. et al. Expression of snail in pancreatic cancer promotes metastasis and
chemoresistance. J. Surg. Res. 141, 196-203 (2007).
21. Wang, Z. et al. Acquisition of epithelial-mesenchymal transition phenotype of
gemcitabine-resistant pancreatic cancer cells is linked with activation of the
notch signaling pathway. Cancer Res. 69, 2400-2407 (2009).
22. Alagesan, B. et al. Combined MEK and PI3K inhibition in a mouse model of
pancreatic cancer. Clin. Cancer Res. 21, 396-404 (2015).
23. Cursons, J. et al. Stimulus-dependent differences in signalling regulate
epithelial-mesenchymal plasticity and change the effects of drugs in breast
cancer cell lines. Cell Commun. Signal. 13, 26 (2015).
24. Javle, M. M. et al. Epithelial-mesenchymal transition (EMT) and activated
extracellular signal-regulated kinase (p-Erk) in surgically resected pancreatic
cancer. Ann. Surg. Oncol. 14, 3527-3533 (2007).
25. Masugi, Y. et al. Solitary cell infiltration is a novel indicator of poor prognosis
and epithelial-mesenchymal transition in pancreatic cancer. Hum. Pathol. 41,
1061-1068 (2010).
26. Park, J. Y. et al. Pdx1 expression in pancreatic precursor lesions and
neoplasms. Appl. Immunohistochem. Mol. Morphol. 19, 444-449 (2011).
27. Offield, M. F. et al. PDX-1 is required for pancreatic outgrowth and
differentiation of the rostral duodenum. Development 122, 983-995 (1996).
28. Hidalgo, M. Pancreatic cancer. N. Engl. J. Med. 362, 1605-1617 (2010).
29. Kleger, A., Perkhofer, L. & Seufferlein, T. Smarter drugs emerging in pancreatic
cancer therapy. Ann. Oncol. 25, 1260-1270 (2014).
30. Gore, A. J., Deitz, S. L., Palam, L. R., Craven, K. E. & Korc, M. Pancreatic
cancer-associated retinoblastoma 1 dysfunction enables TGF-β to promote
proliferation. J. Clin. Invest. 124, 338-352 (2014).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We wish to thank D. Lundy, S. Yang, Z. Xiao,
R. Deliz-Aguirre, T. Miyake and S. Lovisa for technical support and
K. M. Ramirez and R. Jewell in the South Campus Flow Cytometry Core
Laboratory of MD Anderson Cancer Center for flow cytometry cell sorting and
analyses (partly supported by NCI grant no. P30CA16672). We also wish to
thank E. Chang for scanning slides of histopathological specimens. This study
was primarily supported by the Cancer Prevention and Research Institute
of Texas. The research in the LeBleu laboratory is supported by UT MDACC
Khalifa Bin Zayed Al Nahya Foundation.


Author Contributions R.K. conceptually designed the strategy for this
study and provided intellectual input. V.S.L. helped design experimental
strategy, provided intellectual input, supervised the studies, performed
immunohistochemistry and culture experiments, generated the figures
and wrote the manuscript. X.Z. performed experiments to generate the
genetically engineered mouse models and helped characterize the mouse
phenotype, performed culture experiments, collected the tissue for analysis
and contributed to the manuscript writing. J.L.C. characterized the mouse
phenotype, analysed the data related to the genetically engineered mouse
models, collected data, generated the figures and helped with manuscript
writing and editing. H.S. performed experiments with mice and injected
cancer cells and helped collect tissue. J.Ki., M.S., J.Ka. and C.-C.W. performed
experiments and collected data. The data were analysed by J.L.C., V.S.L., X.Z.,
J.Ki. and C.-C.W.


Author Information Gene expression microarray data have been deposited 
in the Gene Expression Omnibus under accession number GSE66981. 
Reprints and permissions information is available at www.nature.com/ 
reprints. The authors declare no competing financial interests. Readers 
are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to
R.K. (rkalluri@mdanderson.org).


METHODS 

Mice. Characterization of disease progression and genotyping for the Pdx1-
cre;LSL-KrasG12D;P53R172H/+ (herein referred to as KPC) and Ptf1a (P48)-cre;LSL-
KrasG12D;Tgfbr2L/L (herein referred to as KTC) mice were previously described31–35.
These mice were bred to Snai1L/L (herein referred to as SnailcKO), Twist1L/L
(herein referred to as TwistcKO), and R26-LSL-EYFP36. SnailcKO mice were kindly
provided by S. J. Weiss. TwistcKO mice were kindly provided by R. R. Behringer
via the Mutant Mouse Regional Resource Center (MMRRC) repository. The
resulting progeny were referred to as KPC, KPC;SnailcKO, KPC;TwistcKO, KTC
and KTC;SnailcKO mice and were maintained on a mixed genetic background.
Both males and females were used indiscriminately. Mice were given gemcitabine
(G-4177, LC Laboratories) via intraperitoneal injection (i.p.) every other day at
50 mg kg−1 of body weight. Hypoxyprobe was injected in a subset of mice i.p.
at 60 mg kg−1 of body weight 30 min before euthanasia. For in vivo colonization
assays, one million KPC, KPC;TwistcKO and KPC;SnailcKO tumour cells in 100 µl of
PBS were injected intravenously via the retro-orbital venous sinus. Four to eleven
mice were injected per cell line. All mice were euthanized at 15 days post injection.
All mice were housed under standard housing conditions at MD Anderson Cancer 
Center (MDACC) animal facilities, and all animal procedures were reviewed and 
approved by the MDACC Institutional Animal Care and Use Committee. Tumour 
growth met the standard of a diameter less than or equal to 1.5 cm. Investigators 
were not blinded to group allocation but were blinded for the assessment of the 
phenotypic outcome by histological analyses. No statistical methods were used to 
predetermine sample size and the experiments were not randomized. 
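The weight-based doses above translate into a per-injection amount once a body weight is known. The following is a minimal sketch of that conversion; the 25 g body weight is a hypothetical value chosen for illustration and does not come from the study, and the function name is likewise illustrative.

```python
def dose_mg(body_weight_g: float, dose_mg_per_kg: float) -> float:
    """Amount of compound (mg) for one injection at a weight-based dose."""
    return body_weight_g * dose_mg_per_kg / 1000


# Doses stated in the Methods, in mg per kg of body weight.
GEMCITABINE_MG_PER_KG = 50
HYPOXYPROBE_MG_PER_KG = 60

# Hypothetical mouse weight, for illustration only.
example_weight_g = 25

gemcitabine_mg = dose_mg(example_weight_g, GEMCITABINE_MG_PER_KG)
hypoxyprobe_mg = dose_mg(example_weight_g, HYPOXYPROBE_MG_PER_KG)
print(f"gemcitabine: {gemcitabine_mg} mg, Hypoxyprobe: {hypoxyprobe_mg} mg")
```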

Histology and histopathology. Histology, histopathological scoring, Masson's 
trichrome staining (MTS), and Picrosirius Red have been previously described’**?. 
Formalin-fixed tissues were embedded in paraffin and sectioned at 5 µm thickness.
MTS was performed using Gomori’s Trichrome Stain Kit (38016SS2, Leica
Biosystems). Picrosirius red staining for collagen was performed using 0.1% picrosirius
red (Direct Red 80; Sigma) and counterstained with Weigert’s haematoxylin.
Sections were also stained with haematoxylin and eosin (H&E). Histopathological 
measurements were assessed by scoring H&E-stained tumours for relative per- 
centages of each histopathological phenotype: normal (non-neoplastic), PanIN, 
well-differentiated PDAC, moderately-differentiated PDAC, poorly-differentiated 
PDAC, sarcomatoid carcinoma, or necrosis. When tumour histology was missing
or of poor quality, the mouse was excluded from primary tumour histological
analysis; this decision was made blind to genotype information. A histological
invasion score of the tumour cells into the surrounding stroma was scored on a 
scale of 0 to 2, with 0 indicating no invasion and 2 indicating high invasion, where 
invasion is defined as tumour cell dissemination throughout the stroma away from 
clearly defined epithelial ‘nests’. Microscopic metastases were observed in H&E-
stained tissue sections of the liver, lung and spleen. Positivity (one or more lesions
in a tissue) was confirmed using CK19 and YFP immunohistochemistry. These data
are presented as a contingency table (Fig. 2e) and represented as the number
of positive tissues out of the number of tissues scored. The ‘Any’ metastasis score 
is the number of mice positive for a secondary lesion found anywhere throughout 
the body out of the total number of mice scored. 

Immunohistochemistry and Immunofluorescence. Tissues were fixed in 10% 
formalin overnight, dehydrated, and embedded in paraffin, and 5-µm-thick sections
were then processed for analyses. Immunohistochemical analysis was performed 
as described**. Heat-mediated antigen retrieval in 1 mM EDTA + 0.05% Tween20 
(pH 8.0) for one hour (pressure cooker) was performed for Snail and Twist; 10 mM
citrate buffer, pH 6.0, was used for one hour (microwave) for Ki67 or for 10 min for
all other antibodies. Primary antibodies were as follows: αSMA (M0851, DAKO,
1:400 or ab5694, Abcam, 1:400), cleaved caspase-3 (9661, Cell Signaling, 1:200), 
CD3 (A0452, DAKO, 1:200), CD31 (Dia310M, DiaNova, 1:10), CK8 (TROMA-1, 
Developmental Studies Hybridoma Bank, 1:50), CK19 (ab52625, Abcam, 1:100), 
Cnt3 (HPA023311, Sigma-Aldrich, 1:400), ENT1 (LS-B3385, LifeSpan Bio., 1:100), 
E-cadherin (3195S, Cell Signaling, 1:400), ENT2 (ab48595, Abcam, 1:200), Ki67 
(RM-9106, Thermo Scientific, 1:400), Slug (9585, Cell Signaling, 1:200), Snail 
(ab180714, Abcam, 1:100), Sox4 (ab86809, Abcam, 1:200), Twist (ab50581, Abcam, 
1:100), YFP (ab13970, Abcam, 1:1000), Zeb1 (NBP1-05987, Novus, 1:500), and 
Zeb2 (NBP1-82991, Novus, 1:100). Sections for pimonidazole adduct (HPI Inc.,
1:50) or αSMA immunohistochemistry staining were blocked with M.O.M. kit
(Vector Laboratories, West Grove, PA) and developed by DAB according to the
manufacturer’s recommendations. Alternatively, for immunofluorescence, sections
were dual-labelled using secondary antibodies conjugated to Alexa Fluor 488 or
594 or tyramide signal amplification (TSA, PerkinElmer) conjugated to FITC.
Lineage-traced (YFP-positive) EMT analysis was performed on 8-µm-thick O.C.T.
medium (TissueTek)-embedded frozen sections. Sections were stained for αSMA
(ab5694, Abcam, 1:400) followed by Alexa Fluor 680 conjugated secondary
antibody. Bright-field imagery was obtained on a Leica DM1000 light microscope or
the Perkin Elmer 3DHistotech Slide Scanner. Fluorescence imagery was obtained
on a Zeiss Axio Imager.M2 or the Perkin Elmer Vectra Multispectral imaging
platform. The images were quantified for per cent positive area using NIH ImageJ
analysis software (αSMA, Pimonidazole, Slug, and CD31), per cent positive cells
using InForm analysis software (Ki67 and CD3), or scored for intensity either
positive or negative (αSMA/CK8 dual staining, αSMA, CK19, YFP, Zeb1, Zeb2,
Sox4, E-cadherin and cleaved caspase-3) or on a scale of 1–3 (E-cadherin) or 1–4
(ENT1, ENT2 and Cnt3).

In situ hybridization. In situ hybridization (ISH) was performed on frozen 
tumour sections as previously described**. In brief, 10-µm-thick sections
were hybridized with antisense probes to Twist1 and Snail overnight at 65 °C.
After hybridization, sections were washed and incubated with AP-conjugated 
sheep anti-DIG antibody (1:2,000; Roche) for 90 min at room temperature. 
After three washes, sections were incubated in BM Purple (Roche) until posi- 
tive staining was seen. Digoxigenin-labelled in situ riboprobes were generated 
with an in vitro transcription method (Promega and Roche) using a PCR template.
The following primers were used to generate the template PCR product:
Twist1, forward, 5′-CGGCCAGGTACATCGACTTC-3′; reverse,
5′-TAATACGACTCACTATAGGGAGATTTAAAAGTGTGCCCCACGC-3′; Snail, forward,
5′-CAACCGTGCTTTTGCTGAC-3′; reverse,
5′-TAATACGACTCACTATAGGGAGACCTTTAAAATGTAAACATCTTTCTCC-3′.
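Both reverse primers listed for the riboprobe template share the same 5′ tail. As a minimal sketch (an editorial illustration, not part of the published protocol), the Python below extracts that shared tail and checks whether it begins with the widely used T7 RNA polymerase promoter sequence TAATACGACTCACTATAGGG. The text does not name the polymerase, so reading the tail as a T7 promoter is an assumption, and the helper name `shared_prefix` is hypothetical.

```python
# Reverse primers copied from the Methods text (5'->3').
TWIST1_REVERSE = "TAATACGACTCACTATAGGGAGATTTAAAAGTGTGCCCCACGC"
SNAIL_REVERSE = "TAATACGACTCACTATAGGGAGACCTTTAAAATGTAAACATCTTTCTCC"

# Assumption: consensus T7 RNA polymerase promoter (not named in the source).
T7_PROMOTER = "TAATACGACTCACTATAGGG"


def shared_prefix(a: str, b: str) -> str:
    """Return the longest common 5' prefix of two sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


tail = shared_prefix(TWIST1_REVERSE, SNAIL_REVERSE)
has_t7_tail = tail.startswith(T7_PROMOTER)

# Whatever follows the shared tail is the gene-specific part of each primer.
twist1_specific = TWIST1_REVERSE[len(tail):]
snail_specific = SNAIL_REVERSE[len(tail):]

print(tail, has_t7_tail)
print(twist1_specific, snail_specific)
```

A common-prefix comparison can overrun into the gene-specific region if the two primers happen to start with the same base after the tail; here they differ at the first base after the tail, so the split is unambiguous.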

Gene expression profiling. Total RNA was isolated from tumours of KPC 
control, KPC;TwistcKO and KPC;SnailcKO mice (n = 3 in each group) using TRIzol
(15596026, Life Technologies) and submitted to the Microarray Core Facility at
MD Anderson Cancer Center. Gene expression analysis was performed using
MouseWG-6 v2.0 Gene Expression BeadChip (Illumina). The Limma package
from R Bioconductor³⁵ was used for quantile normalization of expression arrays
and to analyse differentially expressed genes between cKO and control sample
groups. Gene expression microarray data have been deposited in GEO (Accession
number GSE66981). Genes upregulated in cells acquiring an EMT program were
expected to be downregulated in the TwistcKO and SnailcKO tumours compared to
control tumours. 
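
The quantile normalization step mentioned above can be illustrated with a short sketch. This is a simplified pure-Python version written for illustration only: it is not the Limma implementation, it ignores tied values and missing data, and the toy expression values are invented. The function name `quantile_normalize` is hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length arrays (one list per sample).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are not treated specially in this sketch.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the k-th smallest value across all samples.
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices from low to high
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: three hypothetical arrays measuring four probes each.
toy = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(toy)
for sample in normalized:
    print(sample)
```

After normalization every array contains the same set of values, so differences between samples reflect only the rank of each probe within its array.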

CTC assays. Blood (200 µl) was collected from KPC;LSL-YFP and
KPC;TwistcKO;LSL-YFP (ROSA-LSL-YFP lineage tracing of cancer cells) mice and
incubated with 10 ml of ACK lysis buffer (A1049201, Gibco) at room temperature
to lyse red blood cells. Cell pellets were resuspended in PBS containing 2% FBS and
analysed for the number of YFP⁺ cells by flow cytometry (BD LSRFortessa X-20
Cell Analyzer). The data were expressed as the percentage of YFP⁺ cells among gated
cells, with 100,000 cells analysed at the time of acquisition. Whole blood cell pellets
were also assayed for the expression of KrasG12D transcripts, using quantitative
real-time PCR analyses (described below). 

Primary pancreatic adenocarcinoma cell culture and analyses. Derivation of 
primary PDAC cell lines was performed as previously described³⁶. Fresh tumours
were minced with sterile razor blades, digested with dispase II (17105041, Gibco,
4 mg ml⁻¹)/collagenase IV (17104019, Gibco, 4 mg ml⁻¹)/RPMI for 1 h at 37 °C,
filtered through a 70 µm cell strainer, resuspended in RPMI/20% FBS and then seeded on
collagen I-coated plates (087747, Fisher Scientific). Cells were maintained in RPMI
medium with 20% FBS and 1% penicillin, streptomycin and amphotericin B (PSA)
antibiotic mixture. Cancer cells were further purified by FACS based on YFP or
E-cadherin expression (anti-E-cadherin antibody, 50-3249-82, eBioscience, 1:100).
Cells sorted on a BD FACSAria II sorter (South Campus Flow Cytometry
Core Lab of MD Anderson Cancer Center) were subsequently expanded in vitro.
All studies were performed on cells cultivated for fewer than 30 passages. As these are
primary cell lines, no further authentication methods were applicable and no
mycoplasma tests were performed.

MTT and drug sensitivity assays. An MTT assay was performed to measure cell
proliferation and viability using Thiazolyl Blue Tetrazolium Bromide (MTT, M2128,
Sigma) following the manufacturer’s recommendations, with an incubation of two
hours at 37 °C. For the drug treatment studies, a cell line derived from each of the
KPC, KPC;SnailcKO and KPC;TwistcKO mice was treated with 20 µM gemcitabine
(G-4177, LC Laboratories) or 100 µM erlotinib (5083S, NEB) for 48 h, and the relative
cell viability was then measured by MTT assay. n is defined as the number of
biological replicates of a single cell line. Control conditions included 1% DMSO
vehicle for erlotinib. The relative absorbance was normalized, with the control (time
0 h or vehicle-treated) arbitrarily set to 1 for absorbance or to 100% for drug survival.
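
The normalization described above, in which the control is set to 100% for drug survival, amounts to dividing each treated absorbance by the mean control absorbance. A minimal sketch follows; the absorbance values are invented and the function name `relative_survival` is hypothetical.

```python
def relative_survival(treated_absorbance, vehicle_absorbance):
    """Express treated-well absorbances as a percentage of the mean vehicle control."""
    control_mean = sum(vehicle_absorbance) / len(vehicle_absorbance)
    return [100.0 * a / control_mean for a in treated_absorbance]


# Hypothetical MTT absorbance readings (arbitrary units), not data from the study.
vehicle = [0.8, 1.0, 1.2]   # vehicle-treated replicates, mean = 1.0
treated = [0.5, 0.25]       # drug-treated replicates

survival = relative_survival(treated, vehicle)
print(survival)
```

Setting the proliferation control to 1 instead of 100% is the same calculation without the factor of 100.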

Quantitative real-time PCR analyses (qPCR). RNA was extracted from
whole blood cell pellets following ACK lysis using the PicoPure Extraction
kit as directed (KIT0214, Arcturus), or from cultured primary pancreatic
adenocarcinoma cells using TRIzol (15596026, Life Technologies). cDNA
was synthesized using TaqMan Reverse Transcription Reagents (N8080234,
Applied Biosystems) or High Capacity cDNA Reverse Transcription Kit
(4368814, Applied Biosystems). Primers for KrasG12D recombination were:
KrasG12D, forward, 5′-ACTTGTGGTGGTTGGAGCAGC-3′; reverse,
5′-TAGGGTCATACTCATCCACAA-3′. 1/ΔCt values are presented to show
KrasG12D expression in the indicated experimental groups; statistical analyses
were performed on ΔCt. Primer sequences for EMT-related genes are listed in
Supplementary Table 1; GAPDH was used as an internal control. The data are
presented as the relative fold change and statistical analyses were performed on ΔCt.
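
To make the ΔCt quantities concrete, the sketch below computes ΔCt against a reference gene, its reciprocal, and a relative fold change. The text does not state the fold-change formula; the 2^(−ΔΔCt) convention used here is a common choice and is an assumption, as are the Ct values and the function names.

```python
def delta_ct(ct_target, ct_reference):
    """ΔCt: target-gene Ct minus reference-gene (for example GAPDH) Ct."""
    return ct_target - ct_reference


def fold_change(dct_sample, dct_control):
    """Relative fold change by the 2^(-ΔΔCt) convention (assumed, not stated in the text)."""
    return 2.0 ** -(dct_sample - dct_control)


# Hypothetical Ct values, not data from the study.
dct_control = delta_ct(ct_target=24.0, ct_reference=18.0)  # 6.0
dct_cko = delta_ct(ct_target=26.0, ct_reference=18.0)      # 8.0

relative_expression = fold_change(dct_cko, dct_control)
inverse_dct = 1.0 / dct_cko  # the 1/ΔCt presentation mentioned for KrasG12D

print(dct_control, dct_cko, relative_expression, inverse_dct)
```

A higher ΔCt means lower expression, which is why the reciprocal or the negative exponent is used for presentation while the tests are run on ΔCt itself.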
Tumour sphere assay. Tumour sphere assays were performed as previously 
described**. Two million cultured primary tumour cells were plated in a low- 
adherence 100-mm dish (FB0875713, Fisherbrand) with 1% FBS, Dulbecco's 
modified Eagle's medium, and penicillin/streptomycin/amphotericin. Cells were 
incubated for 7 days and formed spheres were counted at 100× magnification.
Three, two and three cell lines were analysed for the KPC control, KPC;TwistcKO and
KPC;SnailcKO groups, respectively; five fields of view per cell line were quantified.

MRI analyses. MRI was performed using a 7T small animal MR system
as previously described³⁷. To measure tumour volume, suspected regions
were drawn blinded on each slice based on normalized intensities. The volume
was calculated as the sum of the delineated regions of interest (in mm²) multiplied
by the 1 mm slice distance. None of the mice had a tumour burden that exceeded 1.5 cm in
diameter, in accordance with institutional regulations. All mice with measurable
tumours were enrolled in the study (see Extended Data Table 3). Mice were imaged
twice, once at the beginning of the enrolment (day 0), and a second time 20 days
(day 19) afterwards. Surviving animals were euthanized at end point (day 21) for
histological characterization.
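
The volume calculation described above is a sum of per-slice areas multiplied by the slice spacing. A minimal sketch, with invented region-of-interest areas and a hypothetical function name:

```python
def tumour_volume_mm3(roi_areas_mm2, slice_distance_mm=1.0):
    """Sum the delineated ROI areas (mm²) and multiply by the slice distance (mm)."""
    return sum(roi_areas_mm2) * slice_distance_mm


# Hypothetical ROI areas from three consecutive slices, not data from the study.
areas = [12.5, 20.0, 17.5]
volume = tumour_volume_mm3(areas)  # 1 mm slice distance, as stated in the text
print(volume)
```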

Statistical analyses. Statistical analyses were performed on the mean values of
biological replicates in each group using unpaired two-tailed or one-tailed t-tests
(one-tailed for qPCR only), or one-way ANOVA with Tukey’s multiple comparisons test, using
GraphPad Prism, as stipulated in the figure legends. χ² analyses, using SPSS
statistical software, were performed comparing control to cKO groups for metastatic
or colonization frequency across multiple histological parameters in all mice and
mice >120 days of age in Extended Data Table 1. Fisher’s exact P value was used
to determine significance. Results are outlined in Extended Data Table 2. Kaplan–
Meier plots were drawn for survival analysis and the log rank Mantel–Cox test
was used to evaluate statistical differences, using GraphPad Prism. Data met the
assumptions of each statistical test; where variance was not equal (determined by an
F-test), Welch’s correction for unequal variances was applied. Error bars represent
s.e.m. when multiple visual fields were averaged to produce a single value for each
animal, which was then averaged again to represent the mean bar for the group in
each graph. P < 0.05 was considered statistically significant.
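
For readers who want to see how a Fisher’s exact P value is obtained from a 2×2 contingency table, a minimal standard-library sketch follows. It uses the common two-sided definition (summing the probabilities of all tables no more likely than the observed one); the authors used SPSS, whose exact settings are not given, so this sketch is not claimed to reproduce the values in Extended Data Table 2. The function name is hypothetical, and the example counts are the liver-metastasis totals listed in Extended Data Table 1 (11 of 31 control mice, 6 of 14 KPC;TwistcKO mice).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x positives in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more probable than the observed table.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p)


# Liver metastasis: positive / negative counts per cohort.
p_liver = fisher_exact_two_sided(11, 31 - 11, 6, 14 - 6)
print(round(p_liver, 3))
```

The small tolerance in the comparison guards against floating-point ties between equally probable tables.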


31. Hingorani, S. R. et al. Trp53R172H and KrasG12D cooperate to promote
chromosomal instability and widely metastatic pancreatic ductal
adenocarcinoma in mice. Cancer Cell 7, 469–483 (2005).

32. Ijichi, H. et al. Aggressive pancreatic ductal adenocarcinoma in mice caused by
pancreas-specific blockade of transforming growth factor-β signaling in
cooperation with active Kras expression. Genes Dev. 20, 3147–3160 (2006).

33. Özdemir, B. C. et al. Depletion of carcinoma-associated fibroblasts and fibrosis
induces immunosuppression and accelerates pancreas cancer with reduced
survival. Cancer Cell 25, 719–734 (2014).

34. Keskin, D. et al. Targeting vascular pericytes in hypoxic tumors increases lung
metastasis via angiopoietin-2. Cell Rep. 10, 1066–1081 (2015).

35. Smyth, G. K. Limma: linear models for microarray data. In Bioinformatics and
Computational Biology Solutions using R and Bioconductor (eds Gentleman, R.,
Carey, V., Dudoit, S., Irizarry, R. & Huber, W.) (Springer, 2005).

36. Ying, H. et al. Oncogenic Kras maintains pancreatic tumors through regulation
of anabolic glucose metabolism. Cell 149, 656–670 (2012).

37. Melo, S. A. et al. Glypican-1 identifies cancer exosomes and detects early
pancreatic cancer. Nature 523, 177–182 (2015).




[Extended Data Figure 1 image not reproduced. Recoverable panel labels: KPC, KPC;TwistcKO, KPC;SnailcKO; Pancreas mass (g); Snai1/DRAQ5/CK8; Twist1/DRAQ5/CK8; Merge; Zoom; KPC;LSL-YFP; genes down-regulated or up-regulated in EMT.]

Extended Data Figure 1 | EMT inhibition is specific to tumour
epithelium. a, Representative images of haematoxylin and eosin-stained
small intestine (SmInt), kidney, and heart (scale bar, 100 µm). b, Pancreatic
mass of 29 (KPC), 13 (KPC;TwistcKO) and 28 (KPC;SnailcKO) mice, error
bars represent s.d.; one-way ANOVA. c, Merge of Twist1 or Snail in situ
hybridization (black) followed by CK8 (red) immunolabelling in tumours
from KPC and KPC;TwistcKO or KPC;SnailcKO mice, respectively. White
arrows highlight positive cells in the stroma, yellow arrows highlight
negative epithelium (scale bar, 20 µm). d, Twist or Snail immunostaining
in KPC and KPC;TwistcKO or KPC;SnailcKO tumours, respectively. Black
arrows highlight positive cells in the stroma, red arrows highlight negative
epithelium (scale bar, 20 µm). e, Channel separations of the representative
images of αSMA immunolabelling in YFP lineage-traced tumours found
in Fig. 1f (scale bar, 50 µm). f, EMT gene expression signature analysis in
KPC, KPC;TwistcKO and KPC;SnailcKO cohorts (n = 3 mice). Red arrows
indicate reduced Twist1 and Snail expression in KPC;TwistcKO and
KPC;SnailcKO cohorts, respectively.




[Extended Data Figure 2 image not reproduced. Recoverable axis and panel labels: E-cadherin intensity score/visual field; No. of ZEB2⁺ PDAC cells/visual field; No. of SOX4⁺ PDAC cells/visual field; Percent Slug⁺ area/visual field; Sirius Red; Percent αSMA⁺ area/visual field; Percent CD31⁺ area/visual field; Percent hypoxic area/visual field (Pimonidazole).]

Extended Data Figure 2 | General suppression of EMT markers
does not affect desmoplasia. a, E-cadherin immunolabelling and
quantification of primary KPC (n = 5 mice), KPC;TwistcKO (n = 5 mice)
and KPC;SnailcKO (n = 4 mice) (scale bar, 100 µm). b, Zeb2 immunolabelling
and quantification of primary KPC (n = 6 mice), KPC;TwistcKO (n = 5 mice)
and KPC;SnailcKO (n = 7 mice) (scale bar, 50 µm; inset scale bar, 20 µm).
c, Sox4 immunolabelling and quantification of primary KPC (n = 7 mice),
KPC;TwistcKO (n = 6 mice) and KPC;SnailcKO (n = 8 mice) (scale bar,
50 µm; inset scale bar, 20 µm). d, Slug immunolabelling and quantification
of primary KPC (n = 4 mice), KPC;TwistcKO (n = 4 mice) and
KPC;SnailcKO (n = 4 mice) tumours (scale bar, 50 µm; inset scale bar, 20 µm).
e, Sirius Red staining and quantification of primary KPC (n = 21 mice),
KPC;TwistcKO (n = 8 mice) and KPC;SnailcKO (n = 11 mice)
(scale bar, 100 µm; error bars represent s.d.). f, αSMA immunolabelling and
quantification of primary KPC (n = 5 mice), KPC;TwistcKO (n = 5 mice) and
KPC;SnailcKO (n = 5 mice) (scale bar, 100 µm). g, CD31 immunolabelling
and quantification of primary KPC (n = 4 mice), KPC;TwistcKO (n = 4
mice) and KPC;SnailcKO (n = 3 mice) (scale bar, 200 µm; inset scale bar,
100 µm). h, Pimonidazole staining and quantification of primary KPC (n = 4
mice), KPC;TwistcKO (n = 4 mice) and KPC;SnailcKO (n = 4 mice) (scale
bar, 100 µm). i, CD3 immunolabelling and quantification of primary KPC
(n = 5 mice), KPC;TwistcKO (n = 5 mice) and KPC;SnailcKO (n = 5 mice)
(scale bar, 100 µm; inset scale bar, 25 µm). Unless otherwise indicated error
bars represent s.e.m., and significance determined by one-way ANOVA.
*P < 0.05, **P < 0.01, ***P < 0.001; ns, not significant.




[Extended Data Figure 3 image not reproduced. Recoverable axis and panel labels: CK8/αSMA/Ki-67/DAPI; KPC;LSL-YFP; KPC;TwistcKO;LSL-YFP; No. of ZEB1⁺ metastatic tumor cells/visual field (400×); No. of αSMA⁺ metastatic tumor cells/visual field (400×); No. of E-cadherin⁺ metastatic tumor cells/visual field (400×).]

Extended Data Figure 3 | EMT suppression does not alter epithelial
characteristics of metastases. a, Immunolabelling of primary tumours
(n = 3 mice) for αSMA (red), CK8 (green), Ki67 (white) and DAPI (blue);
yellow arrows indicate EMT⁺ cells (scale bar, 20 µm). b, Representative
dot plots of circulating YFP⁺ cells. c, Images of serial sections of
KPC;LSL-YFP lung and liver metastasis stained for haematoxylin and
eosin or immunolabelled for CK19 or YFP. Yellow dashed box represents
magnified areas in panel below (scale bar, 200 µm; magnification scale
bar, 50 µm). d, KPC metastatic tumours stained for Twist and Snail (n = 3
mice; scale bar, 20 µm; inset scale bar, 10 µm). e, Zeb1 immunolabelling and
quantification of metastatic KPC (n = 4 mice), KPC;TwistcKO (n = 3 mice)
and KPC;SnailcKO (n = 4 mice) (scale bar, 50 µm; inset scale bar, 20 µm).
f, αSMA immunolabelling and quantification of metastatic KPC (n = 3
mice), KPC;TwistcKO (n = 3 mice) and KPC;SnailcKO (n = 3 mice) (scale bar,
50 µm; inset scale bar, 20 µm). g, E-cadherin staining on serial sections of
αSMA immunolabelling and quantification of metastatic KPC (n = 4 mice),
KPC;TwistcKO (n = 3 mice) and KPC;SnailcKO (n = 4 mice) (scale bar, 50 µm;
inset scale bar, 20 µm). h, Ki67 immunolabelling and quantification of
metastatic KPC (n = 7 mice), KPC;TwistcKO (n = 3 mice) and KPC;SnailcKO
(n = 3 mice) (scale bar, 50 µm; inset scale bar, 20 µm). Unless otherwise
indicated error bars represent s.e.m., percentages indicated represent per
cent decrease from control, and significance was determined by one-way
ANOVA. *P < 0.05, **P < 0.01, ***P < 0.001; ns, not significant.




[Extended Data Figure 4 image not reproduced. Recoverable axis and panel labels: Relative expression (fold change) for Snail, Twist1, Zeb1, Zeb2, Slug, Vimentin, FoxC2, Cdh2, Cdh1, Ent1, Ent2 and Ent3; Time (hrs); Relative Survival for Erlotinib and Gemcitabine; legend KPC, KPC;TwistcKO, KPC;SnailcKO.]

Extended Data Figure 4 | EMT suppressed primary tumour cells have
reduced mesenchymal markers and show resistance to chemotherapy
in vitro. a, Bright-field micrograph of cultured primary KPC,
KPC;TwistcKO and KPC;SnailcKO cells (scale bar, 50 µm). b, EMT- and
gemcitabine-transport-related gene expression shown by qPCR analysis in
KPC (n = 3–4 cell lines), KPC;TwistcKO (n = 5 cell lines) and KPC;SnailcKO
(n = 5–6 cell lines) (error bars represent s.d., one-tailed t-test, *P < 0.05,
numbers list non-significant P values. nd, not detected, ns, not significant).
c, MTT assay showing cell proliferation in KPC, KPC;TwistcKO and
KPC;SnailcKO cells (n = 8, 8 and 8 biological replicates of a cell line for
each genotype). d, Relative cell viability (MTT assay) in cultured
KPC, KPC;TwistcKO and KPC;SnailcKO cells treated with gemcitabine
or erlotinib (n = 8, 8 and 8 biological replicates of a cell line for each
genotype). Unless otherwise indicated error bars represent s.e.m.,
significance was determined by one-way ANOVA. **P < 0.01,
***P < 0.001, ****P < 0.0001.




[Extended Data Figure 5 image not reproduced. Recoverable axis and panel labels: KTC, KTC;SnailcKO; Relative Percentage of Tissue Phenotype (legend: Necrosis, Poor, Moderate, Well, PanIN, Normal); αSMA/CK8/DAPI; Cleaved Caspase-3.]

Extended Data Figure 5 | EMT inhibition in KTC mice mirrors
phenotype observed in KPC mice. a, Representative images of
haematoxylin and eosin-stained primary tumours (scale bar, 100 µm).
b, Relative percentage of each histological tissue phenotype of KTC (n = 8
mice) and KTC;SnailcKO (n = 6 mice) primary tumours (error bars represent
s.d.). c, Primary tumour invasiveness in KTC (n = 8 mice) and KTC;SnailcKO
(n = 6 mice) (error bars represent s.d.). d, Pancreatic mass in KTC (n = 5
mice) and KTC;SnailcKO (n = 6 mice) (error bars represent s.d.).
e, Immunolabelling and quantification of primary KTC (n = 5 mice),
KTC;SnailcKO (n = 4 mice) for αSMA (red), CK8 (green) and DAPI (blue);
white arrows indicate double-positive cells (scale bar, 20 µm), Zeb1 (scale
bar, 50 µm; inset scale bar, 20 µm), cleaved caspase-3 (scale bar, 50 µm; n = 4
mice for both groups), Ki67 (scale bar, 100 µm), ENT2 (scale bar, 100 µm)
and CNT3 (scale bar, 100 µm); error bars represent s.e.m. Significance was
determined by two-tailed t-test. *P < 0.05, ***P < 0.001; ns, not significant.




[Extended Data Figure 6 image not reproduced. Recoverable axis and panel labels: KTC, KTC;SnailcKO, KTC + GEM, KTC;SnailcKO + GEM; Relative Collagen Deposition/visual field (200×); Relative ECM Deposition (Sirius Red); ENT1 intensity score.]

Extended Data Figure 6 | Desmoplasia is unaffected in EMT suppressed
tumours with or without gemcitabine. a, b, Staining and quantification
of KTC (n = 5 or 6 mice), KTC;SnailcKO (n = 4 or 5 mice), KTC plus
gemcitabine (+ GEM; n = 4 or 5 mice), KTC;SnailcKO + GEM (n = 5
mice) for Masson’s trichrome stain (MTS) (scale bars, 100 µm), Sirius Red
staining (scale bars, 100 µm), and ENT1 (scale bars, 100 µm). Error bars
represent s.d. (MTS and Sirius Red) or s.e.m. (ENT1), and significance was
determined by two-tailed t-test. ns, not significant.




Extended Data Table 1 | Pathological spectrum of primary disease and metastasis in KPC, KPC;TwistcKO and KPC;SnailcKO cohorts


Pathological Spectrum within cohorts 


ID AGE PDA Differentiation a eee Liver ai sae ae Moribund 


1 158 Y Ww Ss G Y ae N Y ¥ 
2 165 Y Ww G N N N N Né 
3 148 Y P Ss G N N 7 N Y 
4 135 Y M Ss G Y N Y Y Ms 
5 95 Y M G N Y N ¥ N 
6 42 Y M G N N N N Y 
7 55 Y P G Ss Y N N Y Y 
8 91 Y M G N N N N N 
9 87 Y Ww G N N N N N 
10 63 Y Pe G Y Y Y Y N 
11 108 Y P Ss G Y N N Y FD 
12 110 Y Ww G N N N N N 
13 104 Y Ww G Y N N Ys Y 
14 54 Y Ww Ss G N N N N Y 
15 108 Y P Ss G N Y N Ng ¥. 
16 42 x P Ss G N N N N Nf 
17 68 Y Ww G N N N N N 
18 107 ¥ PR G N N N N N 
19 87 as P G N N N N N 
20 48 a4 P G Ss N N N N Y 
21 109 24 P G Ss Y ¥. N Y FD 
22 81 Ms P G Y ¥. N Y Y 
23 151 nd WwW G N ¥. N Y Y 
24 47 ¥ M G Ss N N N Y Y 
25 143 ¥ P G Ss N N N Y Y 
26 122 ¥ WwW G Y N N Y N 
27 115 ¥ P. G Y Y N Y N 
28 76 ¥ Ww G N Y N Y N 
29 122 ¥: M Ss G Y N N Y Y 
30 97 Y P G N N N N N 
31 107 ¥. WwW. G N N N N N 
Totals (Median) 31/31 11/31 11/31 2/30 17/31 

% 100.0% 35.5% 35.5% 6.7% 54.8% 

1 148 Y Ww G Ss Y N N Me N 
2 151 Y la Ss G Y ¥ Y Y N 
3 140 Y P G Y bf N n 6 ¥ 
4 53 Y P G Ss N N N N Y 
5 43 Y P G N N N N Y 
6 117 Y P G Ss N N N N N 
f 90 Y P Ss G Y N N Y Y 
8 52 Y P G Ss N N N N Y 
9 104 Y P G N N N N N 
10 218 Y RP G Ss N N Y Y Y 
11 153 Y P. G N Y N Y Y 
12 45 Y P G Ss N N N N Y 
13 77 Y P G Ss Y N N ¥ Y 
14 126 Y P G iS} Y Y N vi Y 
Totals (Median) 14/14 6/14 4/14 2/14 8/14

% 100.0% 42.9% 28.6% 14.3% 57.1% 

1 144 Y Ww G N ¥ N Me N 
2 51 Y P G Ss N N N N Na 
3 105 Y P G Ss N ¥ N Me Ns 
4 111 Y P G N N N N N 
5 106 Y P G Ss Y N Y Y =v 
6 129 Y P G N N N N N 
7 102 Y P' G Ss N nd - Y N 
8 98 Y P G Ss Y N Y Y N 
9 47 Y P G Ss N N N N Y 
10 54 Y Ww G Y mw N Y FD 
an 59 Y M G Y N N Y N 
12 103 Y P G Y N N Y N 
13 60 Y P Ss G Y N Y Y ¥: 
14 77 Y P G Y N N Y Y 
15 57 Y M Ss G Y N N n4 FD 
16 130 Y P G Y ¥ N Y FD 
17 76 Y a G Ss N N N N FD 
18 111 Y a G N ¥ N Y yy; 
19 100 Y PR G Ss Y N Y Y FD 
20 104 Y P G Ss Y N N Y Né 
21 124 Y M G N N N N FD 
22 88 Y P G Ss N N N N Y 
23 192 Y Ww G Y ¥. N Y h 4 
24 122 Y P G N N N N Y 
25 60 Y Ww G Ss N N N N Y 
26 112 x Ww G N Y N Y N 
27 48 Y P G Ss N N N N Y 
28 48 ¥ P G Ss N N N N Y 
29 124 Y P G Ss Y Y Y Y N 
30 215 Y W. G N N N N N 
Totals (Median) 30/30 13/30 9/30 5/29 18/30 

% 100.0% 43.3% 30.0% 17.2% 60.0% 


Y, yes; N, no; W, well; M, moderate; P, poor; G, glandular; S, sarcomatoid; FD, found dead; -, no tissue. 




Extended Data Table 2 | Results of χ² analysis of KPC cohorts in Extended Data Table 1


χ² Analysis

Group                   Parameter                  Fisher's Exact P value
Control vs. TwistcKO    Early Tumor progression    0.458
Control vs. SnailcKO    Early Tumor progression    0.106
Control vs. TwistcKO    Late Tumor progression     0.458
Control vs. SnailcKO    Late Tumor progression     0.106
Control vs. TwistcKO    Sarcomatoid                0.108
Control vs. SnailcKO    Sarcomatoid                0.446

Control vs. TwistcKO    Early Tumor progression    0.580
Control vs. SnailcKO    Early Tumor progression    0.569
Control vs. TwistcKO    Late Tumor progression     0.580
Control vs. SnailcKO    Late Tumor progression     0.569
Control vs. TwistcKO    Sarcomatoid                1.000
Control vs. SnailcKO    Sarcomatoid                0.119

Control vs. TwistcKO    Liver Metastasis           0.744
Control vs. SnailcKO    Liver Metastasis           0.605
Control vs. TwistcKO    Lung Metastasis            0.743
Control vs. SnailcKO    Lung Metastasis            0.786
Control vs. TwistcKO    Spleen Invasion            0.581
Control vs. SnailcKO    Spleen Invasion            0.254
Control vs. TwistcKO    Any Metastasis             1.000
Control vs. SnailcKO    Any Metastasis             0.797

Control vs. TwistcKO    Liver Metastasis           0.627
Control vs. SnailcKO    Liver Metastasis           1.000
Control vs. TwistcKO    Lung Metastasis            0.592
Control vs. SnailcKO    Lung Metastasis            1.000
Control vs. TwistcKO    Spleen Invasion            0.559
Control vs. SnailcKO    Spleen Invasion            1.000
Control vs. TwistcKO    Any Metastasis             0.473
Control vs. SnailcKO    Any Metastasis             0.608




Extended Data Table 3 | Survival and primary tumour burden determined by MRI in KPC, KPC;TwistcKO and KPC;SnailcKO cohorts treated
with gemcitabine


KPC Gemcitabine cohorts


ID  Start Age (Days)  Start Volume (mm³)  End Volume (mm³)  Survival (Days)
1 148 1610.4 D 7 
2 72 29.7 D 13 
3 72 439.8 902.8 21* 
4 80 44.1 D 14 
5 100 536.3 592.3 21* 
6 89 167.0 D 2 
7 94 52.7 D 7 
8 122 90.2 D 14 
9 164 217.9 D 8 
10 143 212.8 D 18 
11 84 323.8 897.2 21* 
12 58 76.7 D 4 
13 58 116.2 D 8 
Mean (Median) 301.4 797.4 
Stdev 406.9 145.1 
1 117 243.0 644.2 21* 
2 75 47.2 180.0 21* 
3 75 45.4 460.9 21* 
4 78 54.6 47.5 21* 
5 46 53.7 66.5 21 
6 96 63.1 D 13 
7 90 23.9 D 13 
8 79 101.0 D 14 
9 52 28.5 D 14 
10 52 49.4 98.706 21* 
11 104 43.4 127.0 21* 
12 104 53.5 12.1 21* 
13 68 56.7 D 15 
14 122 650.1 164.1 21* 
15 104 181.8 78.6 21* 
Mean (Median) 113.0 187.9 
Stdev 154.8 193.0 
Snail+GEM (96)
1 188 255.2 D 12 
2 181 854.7 D 4 
3 127 32.0 59.6 21* 
4 127 58.7 107.4 21* 
5 142 109.8 D 14 
6 54 33.6 57.2 21* 
7 89 17.0 D 13 
8 78 54.9 39.6 21* 
9 78 3.1 D 15 
10 104 209.7 134.3 21* 
11 96 220.0 280.2 21* 
12 96 24.1 46.2 21* 
13 119 711.0 D 18 
14 126 655.6 805.4 21* 
15 119 168.6 D 18 
16 82 453.8 517.4 21* 
17 82 56.7 74.1 21* 
18 90 40.0 D 16 
19 67 80.5 D 10 
20 66 49.5 226.2 21* 
Mean (Median) 204.4 213.4 
Stdev 250.7 231.7 


D, died; *euthanized at end point.




LETTER 


doi:10.1038/nature15767 


In situ structures of the segmented genome and RNA 
polymerase complex inside a dsRNA virus 


Xing Zhang, Ke Ding, Xuekui Yu, Winston Chang, Jingchen Sun & Z. Hong Zhou


Viruses in the Reoviridae, like the triple-shelled human rotavirus 
and the single-shelled insect cytoplasmic polyhedrosis virus (CPV), 
all package a genome of segmented double-stranded RNAs (dsRNAs) 
inside the viral capsid and carry out endogenous messenger RNA 
synthesis through a transcriptional enzyme complex (TEC)¹. By
direct electron-counting cryoelectron microscopy and asymmetric 
reconstruction, we have determined the organization of the dsRNA 
genome inside quiescent CPV (q-CPV) and the in situ atomic 
structures of TEC within CPV in both quiescent and transcribing 
(t-CPV) states. We show that the ten segmented dsRNAs in CPV are 
organized with ten TECs in a specific, non-symmetric manner, with 
each dsRNA segment attached directly to a TEC. The TEC consists 
of two extensively interacting subunits: an RNA-dependent RNA 
polymerase (RdRP) and an NTPase VP4. We find that the bracelet 
domain of RdRP undergoes marked conformational change when 
q-CPV is converted to t-CPV, leading to formation of the RNA 
template entry channel and access to the polymerase active site. 
An amino-terminal helix from each of two subunits of the capsid 
shell protein (CSP) interacts with VP4 and RdRP. These findings 
establish the link between sensing of environmental cues by the 
external proteins and activation of endogenous RNA transcription 
by the TEC inside the virus. 

Each capsid of viruses in the Reoviridae contains 9-12 segmented 
dsRNAs and up to 12 TECs. These RNA-containing viruses are fully 
capable of RNA transcribing and capping. Crystal structures of the 
RdRP component of the TEC have been determined for rotavirus and 
mammalian reovirus (MRV)~°, but no high-resolution in situ struc- 
ture of the TEC is available. Moreover, the organization of TECs with 
the dsRNA genome and the mechanism of transcriptional activation 
have remained unresolved, in contrast to the well understood genome 
organization inside dsDNA viruses*”. 

With only a single protein shell that encloses ten different genome 
segments, CPV is one of the simplest dsRNA viruses® and serves as a 
model system, as highlighted by its contribution to the discovery of 
RNA capping’. To gain insight into the organization of the TEC and 
segmented dsRNA genome, we have determined CPV structure in a 
quiescent (q-CPV) state at 5.1 Å resolution (see Methods and Extended
Data Figs 1 and 2). The structure reveals that each CPV contains ten 
TECs under ten specific positions of the twelve icosahedral vertices 
(Fig. 1). The two vertices without TECs are occupied by rod-like densities
(Fig. 1a–e, Supplementary Video 1 and Extended Data Figs 3 and
4). The previously ambiguous locations of TECs*” are now determined 
to be ten specific positions in each CPV particle, related by incom- 
plete-D3 symmetry, with only one ona ‘south tropic’ position and three 
each around the ‘north tropic; ‘north pole and ‘south pole’ positions 
(Fig. 1d and Supplementary Video 2). 
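The 'pole' and 'tropic' description follows from icosahedral geometry: viewed along a three-fold axis, the twelve vertices fall into four rings of three. The sketch below is an illustration only, not the authors' code. It assumes a standard icosahedron with vertices at cyclic permutations of (0, ±1, ±φ) and takes (1, 1, 1) as the three-fold 'earth axis'; the function names are hypothetical.

```python
import math
from collections import Counter

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def icosahedron_vertices():
    """Return the 12 vertices: cyclic permutations of (0, +/-1, +/-PHI)."""
    verts = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            a, b = s1 * 1.0, s2 * PHI
            verts += [(0.0, a, b), (b, 0.0, a), (a, b, 0.0)]
    return verts


def levels_along_threefold(verts):
    """Count vertices at each height along the three-fold axis (1, 1, 1)."""
    norm = math.sqrt(3)
    heights = [round((x + y + z) / norm, 6) for x, y, z in verts]
    return Counter(heights)


levels = levels_along_threefold(icosahedron_vertices())
for height in sorted(levels, reverse=True):
    print(f"height {height:+.3f}: {levels[height]} vertices")
```

Under these assumptions the two outer rings would correspond to the 'poles' and the two inner rings to the 'tropics'. Filling three of the rings completely and the fourth with a single TEC gives the ten occupied vertices (3 + 3 + 3 + 1) described in the text.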

Each TEC is surrounded by rod-like densities with lengths up to ~650 Å
(Fig. 1a–c, e–f, Extended Data Fig. 4a and Supplementary Video 1).
In most regions, these rods form parallel striations with an inter-rod
distance of ~27 Å, as suggested previously (for example, refs 10, 11).
Some of the rods exhibit the characteristic minor and major grooves
typical of a dsRNA duplex (Fig. 1g). We therefore interpret these rod-like
densities as dsRNA duplexes. Unlike the model of each genome segment
spiralling around one TEC (ref. 12), the duplexes do not spiral locally




Figure 1 | Transcription enzyme complex and dsRNA genome
organization inside CPV. a, Superposition of the high-resolution (3.9 Å)
map of half a capsid (grey) and low-resolution (22 Å) map of dsRNA genome
(radially coloured as in f) and TECs (cyan). b, c, Front (b) and back (c) views
of the dsRNA genome and TECs of panel a. d, Earth-like representation,
illustrating the locations of the ten TECs (surface-rendered) with pseudo-D3
symmetry: three on each pole and the northern tropic but only one on the
southern tropic. e, Cross-sections of the 22 Å density map, perpendicular
to either the ‘earth axis’ in d (top row) or a D3 two-fold axis (bottom row).
Densities of TECs are numbered as in d, and the two vertices without TEC
but with RNA are indicated by white arrows. DC, distance from centre.
f, Boxed region in a containing RNA threads (radially coloured as in the bar)
and a TEC (cyan) with bound dsRNA (dashed box). g, Averaged TEC region,
filtered to 4.5 Å and viewed as the southern-most TEC of a. The RdRP-bound
dsRNA has the same structure in all TECs and shows major (yellow
arrows) and minor (white arrows) grooves.


1California Nanosystems Institute, University of California, Los Angeles, California 90095, USA. 2Department of Microbiology, Immunology and Molecular Genetics, University of California,
Los Angeles, California 90095, USA. 3Bioengineering, University of California, Los Angeles, California 90095, USA. 4Subtropical Sericulture and Mulberry Resources Protection and Safety
Engineering Research Center, Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou,
Guangdong 510642, China.
*These authors contributed equally to this work.


26 NOVEMBER 2015 | VOL 527 | NATURE | 531 


© 2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


Figure 2 | Averaged TEC map at 3.3 Å resolution and de novo modelling
of VP4. a, Averaged map of the TEC region showing VP4 (cyan) and RdRP 
(purple), both anchored to the inner surface of the capsid (grey). b, Atomic 
model of VP4. c-f, The boxed regions in b, showing density (meshes) 
superposed with atomic models of the GTP-binding site (c), a loop (d), 
a helix (e) and an RdRP-interacting loop (f). 


around TECs (Fig. 1a–c, Extended Data Fig. 4 and Supplementary
Video 1); instead, many extend tangentially from one TEC to another
(for example, duplexes i-iii in Fig. 1b; see also Extended Data Fig. 4),
indicating that each dsRNA segment is organized beyond one TEC.
Indeed, the whole RNA genome is organized into seven to eight
non-concentric layers with visible connections between adjacent layers
(Fig. 1e and Extended Data Fig. 3). This extended organization of
dsRNA is consistent with the rather long (~620 Å) persistence length of
dsRNA (ref. 13) and would reduce the energy needed for genome packaging
and transcription. One RNA duplex (the brown one in Fig. 1g) binds
to each of the ten TECs at the same relative position and orientation,
suggesting that this RNA duplex is a conserved feature among the ten
dsRNA segments. However, the organization of the remaining RNA
duplexes differs among the ten TECs (Extended Data Fig. 4g). The two
vertices without TECs are occupied only by roughly parallel dsRNA
densities (Fig. 1a–c).

We also obtained a 3.9 Å resolution asymmetric reconstruction
directly from the raw images of q-CPV and subsequently used
non-crystallographic averaging to improve the resolution to 3.3 Å for
the TEC-containing regions (see Methods and Extended Data Fig. 5).
The averaged map retains a short (~35 Å) RdRP-bound dsRNA density
(Fig. 1g) and resolves the two protein components of the TEC: VP4
and RdRP (Fig. 2a). We built a backbone model of the RdRP-bound
dsRNA and de novo atomic models of both VP4 and RdRP (Fig. 2c-f,
Extended Data Fig. 6 and Supplementary Videos 3-8). VP4 and RdRP
interact extensively (Fig. 2a) with a buried interface area of ~2,800 Å².

VP4 appears ‘L-shaped’ and consists of an amino-terminal (amino
acids 1-252) and a carboxy-terminal (253-561) domain, with two
unresolved/flexible segments (amino acids 23-40 and 86-131) (Fig. 2a,
b and Supplementary Video 3). The N-terminal domain is formed by two
small β-sheets and several α-helices, and the main body of the C-terminal
domain is a Walker-A α/β motif, a well-known NTP-binding motif found
in the P-loop kinase family of proteins. Sequence analysis predicted an
NTP binding site in VP4 (refs 14, 15). Indeed, the VP4 structure contains
a GTP molecule at the predicted NTP binding site of the C-terminal
domain (Fig. 2c and Supplementary Video 4). We thus rename the
C-terminal domain as the NTPase domain (Fig. 2b, c). A similar fold
was also observed in the N-terminal α/β domain of bluetongue virus
VP4. But, remarkably, bluetongue virus VP4 is an RNA capping enzyme
and its α/β domain does not bind GTP (ref. 16). CPV VP4 and its homologues
in other dsRNA viruses have been speculated to function as an
NTPase, as an RNA 5′-triphosphatase (RTPase) or as a helicase (refs 14, 17, 18).
Our structure supports VP4 as an NTPase but shows no interaction with




dsRNA, suggesting that VP4 is unlikely to be a helicase. Whether VP4 is 
the CPV RTPase or an RdRP regulatory factor remains to be determined. 

Like other RdRP structures (refs 2, 3, 19), the CPV RdRP contains a polymerase
core with finger (amino acids 349-515, 549-641), thumb (730-863)
and palm (516-548, 642-729) subdomains (Fig. 3a). This polymerase
core is sandwiched between the N-terminal (1-348) and C-terminal
bracelet (864-1225) domains (Fig. 3 and Extended Data Fig. 6). A
GTP is identified (Figs 1g and 3a) at the position equivalent to the
cap-binding site observed in the MRV RdRP (ref. 2). Interestingly, the
bracelet domain of q-CPV RdRP differs significantly from that of MRV,
despite close similarities between both their polymerase cores and their
N-terminal domains. Consequently, the crystal structure of MRV RdRP
has an open RNA template entry channel and an accessible polymerase
active site (ref. 2), whereas in the q-CPV RdRP the polymerase active site is
covered by the bracelet domain and there is no recognizable channel
for template entry (Figs 3a and 4a and Supplementary Videos 5 and 8).
Since q-CPV is incapable of mRNA transcription, we considered that
these structural differences might be characteristic of conformational
differences between bracelet-containing RdRPs in the quiescent and
transcribing states.

To test this hypothesis, we then determined the structure of
actively transcribing CPV (t-CPV), obtained an averaged TEC map at
4.0 Å resolution, and built atomic models of VP4 and RdRP (Fig. 3b,
Extended Data Fig. 5 and Supplementary Video 9). In t-CPV, the location
of TECs remains the same, as do the structures of VP4 and those
of the N-terminal and polymerase core domains of RdRP (Fig. 3a-f,
Extended Data Figs 7-9 and Supplementary Video 10). By contrast,
the RdRP bracelet domain undergoes major conformational change
(Fig. 3d, e). Consistent with the above hypothesis, the in situ structure
of the t-CPV RdRP is quite similar to the crystal structure of MRV
RdRP in its elongation state (ref. 2) (Extended Data Fig. 10).

The most significant changes of the CPV RdRP between quiescent
and transcribing states involve two neighbouring structural
modules in the bracelet domain: the capsid-proximal module A
(amino acids 1080-1140 containing helices Bα14-Bα16) and the




Figure 3 | Comparison of RdRP in quiescent and transcribing states. 

a, b, Ribbon models of RdRP in quiescent (a) and transcribing (b) states. 
The latter contains fragments of RNA template (orange) and nascent mRNA 
(cyan) inside the active site (box). c-f, Superpositions of RdRP structures 
in quiescent (colour) and transcribing (grey) states shown in full (c) and as 
separate domains—N-terminal (d), polymerase (e), and bracelet (f) with 
modules A (yellow) and B (magenta) further highlighted on its right panel. 
g-i, Densities (grey) and models (ribbons and sticks) of nucleic acids in 
the active site of t-CPV RdRP. The fragments of the (—)RNA template and 
the nascent mRNA in the active site are modelled as a poly-G and poly-C, 
respectively. In h, a CTP is placed in the NTP-binding site and in i, the 
template and mRNA form RNA duplex in the active site of RdRP (surface- 
rendered model). 




Figure 4 | Interactions between TEC and CSPs. a, b, Conformational 
changes of modules A (yellow loops/helices as wires/cylinders) and B 
(magenta loops/helices as wires/cylinders) in quiescent (a) and transcribing 
(b) states. Module A interacts with the capsid shell, and the loop-Bα5
fragment of module B blocks the active site (inset) in the quiescent state 

(a) but retracts to expose the active site in the transcribing (b) state (see 
Extended Data Fig. 7). c, d, The RdRP-bound dsRNA (ribbon) in the 


VP4-proximal module B (912-1010 containing helices Bα5-Bα9)
(Fig. 4a, b and Extended Data Figs 9 and 10). Compared to that in
q-CPV, module A in t-CPV rotates ~40° towards the capsid shell (Fig. 3f
and Extended Data Figs 9 and 10f-k). Consistent with previous icosahedral
reconstructions, our asymmetric reconstructions show that the
capsid shell of t-CPV expands outwards from q-CPV, with the maximal
(~10 Å) expansion occurring at the vertex region (refs 20, 21), to which module
A of the bracelet domain is attached (Fig. 4e, f). Likewise, module B
refolds substantially from quiescent to transcribing state, such that a
template entry channel is formed (Fig. 4a, b) and the blockage of the
active site by the Bα5-loop-Bα6 fragment is removed (Figs 3f, 4a, b and
Extended Data Figs 9 and 10).

In the quiescent state, a helical dsRNA duplex is held inside a shal- 
low cleft formed by modules A and B (Figs 1g, 4c and Extended Data 
Fig. 7d, f) through interaction between a major groove of the RNA 
duplex and residue Arg997 of module B (Fig. 4c, inset). In the tran- 
scribing state, this RNA duplex becomes detached, perhaps as a result 
of refolding the RdRP bracelet domain (Fig. 4d and Extended Data Fig. 
7e, g). We reason that detachment of the RNA duplex would permit 
RNA to slide towards the template entry channel for RNA synthesis in 
t-CPV. Indeed, in the catalytic centre of the t-CPV RdRP, we observe 
weak densities (Fig. 3b, g-i) that match the RNA duplex in the crystal 
structure of the MRV RdRP elongation complex (ref. 2). We are able to place
a 5-base-pair (bp) RNA backbone model in the active site and a CTP 
at the NTP binding site (Fig. 3g-i). 

In addition to enclosing the viral genome and anchoring TECs, the 
CSP also regulates polymerase activity in dsRNA viruses (refs 22-25). In particular,
the CSP N-terminal fragment is involved in genome replication,
mRNA transcription and capping (refs 23, 26, 27). A CSP N-terminal fragment,
unresolved in all previous structures (refs 21, 28-30), is resolved here to form




e, f, Interactions of CSPs (ribbons) with RdRP (purple and yellow) and VP4 
(cyan). Residues of RdRP and VP4 within 4 Å of the capsid shell are
marked in red. An icosahedral five-fold axis is indicated by a green line in 

e and a green pentagon in f. Insets in f indicate two CSP N-terminal helices 
(white density with ribbon-and-stick models): one (upper) interacts only 
with RdRP while the other (lower) with both RdRP and VP4. 


a helix in the two TEC-interacting CSP subunits in both q-CPV and 
t-CPV (Fig. 4e, f). The N-terminal helix of one CSP inserts into the 
interface between the NTPase domain of VP4 and the finger subdo- 
main of RdRP (Fig. 4f, lower inset), and that of the other CSP interacts 
with the bracelet domain of RdRP (Fig. 4f, upper inset). Notably, the 
former is in proximity to the NTP-binding site of the VP4 NTPase, 
suggesting how the N-terminal fragment of CSP is positioned to 
affect TEC. In addition, the structures reveal that other regions (that 
is, areas under the vertex) of CSP also interact with module A of the 
RdRP bracelet domain (Fig. 4e, f). From quiescent to transcribing 
state, module A and the CSP regions involved in this interaction both 
undergo conformational changes. Taken together, these results point 
to a sequence of conformational changes that leads to activation of 
endogenous transcription. Specifically, environmental cues cause the 
capsid shell to expand”’, which triggers refolding of the RdRP bracelet 
domain, leading to formation of the entry channel for an RNA template
and exposure of the polymerase active site for RNA synthesis. 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

Received 8 June; accepted 7 October 2015. 

Published online 26 October 2015; corrected online 25 November 2015 

(see full-text HTML version for details). 


1. Mertens, P. P. C., Rao, S. & Zhou, Z. H. in Virus Taxonomy, VIIIth Report of the ICTV
(eds Fauquet, C. M. et al.) 522-533 (Elsevier/Academic Press, 2004).
2. Tao, Y., Farsetta, D. L., Nibert, M. L. & Harrison, S. C. RNA synthesis in a
cage—structural studies of reovirus polymerase λ3. Cell 111, 733-745 (2002).
3. Lu, X. et al. Mechanism for coordinated RNA packaging and genome
replication by rotavirus polymerase VP1. Structure 16, 1678-1688 (2008).
4. Jiang, W. et al. Structure of epsilon15 bacteriophage reveals genome organization
and DNA packaging/injection apparatus. Nature 439, 612-616 (2006).
5. Lander, G. C. et al. The structure of an infectious P22 virion shows the signal
for headful DNA packaging. Science 312, 1791-1795 (2006).
6. Zhou, Z. H. in Segmented Double-Stranded RNA Viruses: Structure and Molecular
Biology (ed. Patton, J. T.) 27-43 (Caister Academic Press, 2008).
7. Furuichi, Y. “Methylation-coupled” transcription by virus-associated
transcriptase of cytoplasmic polyhedrosis virus containing double-stranded
RNA. Nucleic Acids Res. 1, 809-822 (1974).
8. Estrozi, L. F. et al. Location of the dsRNA-dependent polymerase, VP1, in
rotavirus particles. J. Mol. Biol. 425, 124-132 (2013).
9. Zhang, X., Walker, S. B., Chipman, P. R., Nibert, M. L. & Baker, T. S. Reovirus
polymerase λ3 localized by cryo-electron microscopy of virions at a resolution
of 7.6 Å. Nature Struct. Mol. Biol. 10, 1011-1018 (2003).
10. Nason, E. L. et al. Interactions between the inner and outer capsids of
bluetongue virus. J. Virol. 78, 8059-8067 (2004).
11. Xia, Q., Jakana, J., Zhang, J.-Q. & Zhou, Z. H. Structural comparisons of empty
and full cytoplasmic polyhedrosis virus protein-RNA interactions and
implications for endogenous RNA transcription mechanism. J. Biol. Chem.
278, 1094-1100 (2003).
12. Gouet, P. et al. The highly ordered double-stranded RNA genome of bluetongue
virus revealed by crystallography. Cell 97, 481-490 (1999).
13. Abels, J., Moreno-Herrero, F., Van der Heijden, T., Dekker, C. & Dekker, N.
Single-molecule measurements of the persistence length of double-stranded
RNA. Biophys. J. 88, 2737-2744 (2005).
14. Nibert, M. L. & Kim, J. Conserved sequence motifs for nucleoside triphosphate
binding unique to turreted Reoviridae members and coltiviruses. J. Virol. 78,
5528-5530 (2004).
15. Zhao, S., Liang, C., Hong, J. & Peng, H. Genomic sequence analyses of
segments 1 to 6 of Dendrolimus punctatus cytoplasmic polyhedrosis virus.
Arch. Virol. 148, 1357-1368 (2003).
16. Sutton, G., Grimes, J. M., Stuart, D. I. & Roy, P. Bluetongue virus VP4 is
an RNA-capping assembly line. Nature Struct. Mol. Biol. 14, 449-451 (2007).
17. Stauber, N., Martinez-Costas, J., Sutton, G., Monastyrskaya, K. & Roy, P.
Bluetongue virus VP6 protein binds ATP and exhibits an RNA-dependent
ATPase function and a helicase activity that catalyze the unwinding
of double-stranded RNA substrates. J. Virol. 71, 7220-7226 (1997).
18. Kim, J., Parker, J. S., Murray, K. E. & Nibert, M. L. Nucleoside and RNA
triphosphatase activities of orthoreovirus transcriptase cofactor 2. J. Biol.
Chem. 279, 4394-4403 (2004).
19. Choi, K. H. & Rossmann, M. G. RNA-dependent RNA polymerases from
Flaviviridae. Curr. Opin. Struct. Biol. 19, 746-751 (2009).
20. Yang, C. et al. Cryo-EM structure of a transcribing cypovirus. Proc. Natl Acad.
Sci. USA 109, 6118-6123 (2012).
21. Yu, X., Jiang, J., Sun, J. & Zhou, Z. H. A putative ATPase mediates RNA
transcription and capping in a dsRNA virus. eLife 4, e07901 (2015).
22. Luongo, C. L. et al. Loss of activities for mRNA synthesis accompanies loss of
λ2 spikes from reovirus cores: an effect of λ2 on λ1 shell structure. Virology
296, 24-38 (2002).




23. Patton, J. T., Jones, M. T., Kalbach, A. N., He, Y.-W. & Xiaobo, J. Rotavirus RNA
polymerase requires the core shell protein to synthesize the double-stranded
RNA genome. J. Virol. 71, 9618-9626 (1997).
24. Mansell, E. A. & Patton, J. T. Rotavirus RNA replication: VP2, but not VP6, is
necessary for viral replicase activity. J. Virol. 64, 4988-4996 (1990).
25. Gridley, C. L. & Patton, J. T. Regulation of rotavirus polymerase activity by inner
capsid proteins. Curr. Opin. Virol. 9, 31-38 (2014).
26. McDonald, S. M. & Patton, J. T. Rotavirus VP2 core shell regions critical for viral
polymerase activation. J. Virol. 85, 3095-3105 (2011).
27. Starnes, M. C. & Joklik, W. K. Reovirus protein λ3 is a poly(C)-dependent
poly(G) polymerase. Virology 193, 356-366 (1993).
28. Reinisch, K. M., Nibert, M. L. & Harrison, S. C. Structure of the reovirus core at
3.6 Å resolution. Nature 404, 960-967 (2000).
29. Grimes, J. M. et al. The atomic structure of the bluetongue virus core. Nature
395, 470-478 (1998).
30. Yu, X., Jin, L. & Zhou, Z. H. 3.88 Å structure of cytoplasmic polyhedrosis virus
by cryo-electron microscopy. Nature 453, 415-419 (2008).


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported in part by grants from the
National Institutes of Health (AI094386 and GM071940 to Z.H.Z.), NSFC
(31172263 to J.S.) and NSFGD (S2013010016750 to J.S.). We acknowledge
the use of instruments at the Electron Imaging Center for Nanomachines
supported by UCLA and by instrumentation grants from NIH (1S10RR23057,
1S10OD018111) and NSF (DBI-1338135). We thank P. Ge for carrying out a
Relion reconstruction without an initial model as an independent verification 
step, S. Schein and L. Wang for proof-reading the paper, and P. Afonine for 
model refinement. 


Author Contributions Z.H.Z. supervised research; X.Z., X.Y. and Z.H.Z. 
designed and performed the experiments, analysed and interpreted data and 
wrote the paper; K.D. wrote programs, analysed data and prepared figures; 
W.C. built models; J.S. prepared reagents. All authors reviewed and finalized 
the paper. 


Author Information 3D cryo-electron microscopy (cryoEM) density maps 
have been deposited in the Electron Microscopy Data Bank under the 
accession numbers EMD-6408 (3.3 Å averaged TEC in q-CPV), EMD-6404
(4.0 Å averaged TEC in t-CPV), EMD-6407 (3.9 Å full q-CPV), EMD-6405
(4.8 Å full t-CPV), EMD-6406 (5.1 Å asymmetric reconstruction of q-CPV
by capsid-subtraction method) and EMD-6409 (filtered 22 Å q-CPV
asymmetric reconstruction). The coordinates of atomic models of the
TEC in q-CPV and t-CPV have been deposited in the Protein Data Bank
under accession numbers 3JB6 and 3JB7, respectively. Reprints and
permissions information is available at www.nature.com/reprints. The 
authors declare no competing financial interests. Readers are welcome 
to comment on the online version of the paper. Correspondence and 
requests for materials should be addressed to Z.H.Z. (Hong.Zhou@ucla.edu) 
or J.S. (cyfz@scau.edu.cn, for reagents). 


METHODS 


No statistical methods were used to predetermine sample size. 

Sample preparation and cryoEM imaging. CPV particles were purified as 
described previously’. Purified polyhedra were treated at pH 10.8 with an alkaline
solution (0.2 M Na2CO3-NaHCO3) for 1 h, and then centrifuged at 10,000g
for 40 min. The supernatant was collected and centrifuged at 80,000g for 60 min at
4 °C to pellet the CPV virions. The resulting pellet was directly re-suspended in the
quiescent buffer (70 mM Tris-Cl, pH 8.0, 10 mM MgCl2, 100 mM NaCl and 2 mM
GTP). To prepare the transcribing CPV (t-CPV) particles, 30 µl purified CPV was
incubated in a reaction buffer (70 mM Tris, pH 8.0, 10 mM MgCl2, 100 mM NaCl,
1 mM SAM, 2 mM GTP, 2 mM UTP, 2 mM CTP and 4 mM ATP) at 31 °C for
15 min, and then the reaction was stopped by quenching the reaction tubes on ice.

To prepare cryoEM grids, 2.5 µl of purified CPV sample was applied to a
Quantifoil grid (2/2), blotted for 15 s with an FEI Vitrobot in 100% humidity, and
then plunged into liquid ethane. CryoEM images of the quiescent CPV (q-CPV)
were collected in an FEI Titan Krios cryo-electron microscope, operated at 300 kV
with a nominal magnification of 49,000× (Extended Data Fig. 5g). The microscope
was carefully aligned and electron beam tilt was minimized by a coma-free align- 
ment procedure. Images were recorded on a Gatan K2 direct electron detection 
camera with the counting mode, and the pixel size was calibrated as 1.01 Å per
pixel on the specimen using catalase crystals. The dose rate of the electron beam
was set to ~8 e⁻ per pixel per s, and the image stacks were recorded at 4 frames s⁻¹
for 3 s. The drift between frames in each image stack was corrected with the UCSF
software*!, and the total 12 frames of each stack were merged to generate a final 
image with a total dose of ~25 e⁻ Å⁻². Contrast transfer function (CTF) parameters,
including defocus values and astigmatism, were determined by CTFIND”
(Extended Data Fig. 5g). 
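The exposure figures above are mutually consistent, as a quick calculation shows. This is a minimal sketch, assuming only the approximate values stated in the text (dose rate, exposure time, frame rate and calibrated pixel size); the function name is hypothetical.

```python
def exposure_summary(dose_rate_e_per_px_s, exposure_s, frames_per_s, pixel_size_A):
    """Return (number of frames, accumulated dose in electrons per square angstrom)."""
    n_frames = round(frames_per_s * exposure_s)
    dose_per_px = dose_rate_e_per_px_s * exposure_s   # electrons per pixel
    dose_per_A2 = dose_per_px / pixel_size_A ** 2     # electrons per A^2
    return n_frames, dose_per_A2


# Values quoted in the Methods: ~8 e-/pixel/s, 3 s, 4 frames/s, 1.01 A/pixel.
n_frames, dose = exposure_summary(8.0, 3.0, 4.0, 1.01)
print(n_frames, round(dose, 1))
```

With these inputs the stack has 12 frames and an accumulated dose of roughly 23.5 e⁻ Å⁻², in line with the quoted ~25 e⁻ Å⁻² given that the dose rate itself is approximate.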

Sample grid preparation, cryoEM imaging and drift correction of frames for the
transcribing CPV (t-CPV) were performed using the same procedure described
above for q-CPV with the exception of the camera used. The t-CPV cryoEM images
were recorded on a new Gatan K2 direct electron detection camera attached to a
Gatan imaging filter (GIF Quanta) with a pixel size of 1.36 Å at the specimen scale
(Extended Data Fig. 5g).
Asymmetric reconstruction based on original images. A total of 68,526 particles 
were selected for image processing using Frealign* and Relion*4. The 2x binned 
data set was first processed using icosahedral symmetry with Frealign*’. The cen- 
tres of all particles were then fixed and used for the asymmetrical global search 
with Frealign using the 4× binned data set, starting at 20 Å resolution.

To generate an initial model, we placed the crystal structure of the MRV RdRP (ref. 2)
under a previously obtained CPV capsid map” at the location corresponding to 
that in MRV capsid as previously reported’ and imposed a tetrahedral symmetry 
(that is, with 4 three-fold axes, 3 two-fold axes and 12 asymmetric units), resulting 
in a montage map with an empty CPV capsid containing 12 RdRPs but without 
any VP4. This montage map was filtered to 30 Å resolution and used as the initial
model for image processing with Frealign. After 9 iterations of global search and
2 iterations of refinement, the resolution of the density map was determined to be
3.9 Å. In the final map, only 3 RdRPs (numbers 8-10 in Fig. 1d) remained at the
same locations as in the initial model with the tetrahedral symmetry.
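The symmetry counts quoted for the initial model (4 three-fold axes, 3 two-fold axes, 12 asymmetric units) can be checked by enumerating the tetrahedral rotation group. The sketch below is an illustration rather than part of the authors' pipeline. It assumes the standard orientation in which the two-fold axes lie along x, y and z, and builds the group from cyclic axis permutations combined with an even number of sign flips; all names are hypothetical.

```python
from collections import Counter
from itertools import product

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def matmul(a, b):
    """Multiply two 3x3 integer matrices stored as tuples of tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def tetrahedral_rotations():
    """Enumerate the 12 proper rotations of the tetrahedral group T."""
    ops = []
    for perm in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:       # cyclic permutations
        for signs in product((1, -1), repeat=3):
            if signs[0] * signs[1] * signs[2] != 1:        # keep determinant +1
                continue
            ops.append(tuple(
                tuple(signs[i] if j == perm[i] else 0 for j in range(3))
                for i in range(3)
            ))
    return ops


def order(op):
    """Smallest n >= 1 such that op applied n times is the identity."""
    n, m = 1, op
    while m != IDENTITY:
        m = matmul(m, op)
        n += 1
    return n


ops = tetrahedral_rotations()
orders = Counter(order(op) for op in ops)
print(len(ops), "operations;", orders[2], "two-fold axes;", orders[3] // 2, "three-fold axes")
```

Each three-fold axis contributes two non-trivial rotations (120° and 240°), which is why the eight order-3 elements correspond to four axes. The group order of 12 matches the 12 RdRP copies placed in the montage map.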

The final map was reconstructed using the top 47,968 (70%) particles of the 
original unbinned data set. Averaging all TEC densities under different vertices 
was performed following the procedure described previously*® to improve the 
density quality and the resolution. The effective resolutions of the asymmetrical
and averaged reconstructions were estimated to be 3.9 Å and 3.3 Å, respectively,
based on the FSC (>0.143) and the correlation coefficient (>0.5) between the 
density map and atomic model calculated with Phenix (Extended Data Fig. 5g)°”°8. 
These estimated resolutions are consistent with the observed structural features of 
the density maps (Fig. 2, Extended Data Fig. 5e and Supplementary Videos 3-8). 
The averaged map was filtered to the spatial frequency of 1/(3.3 Å) and sharpened
with a reverse B-factor of −120 Å². This B-factor was chosen with a trial-and-error
method based on the optimization of noise level, backbone density continuity, and 
emergence of side-chain densities. 
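To give a feel for the numbers in this paragraph, the sketch below computes the fraction of particles kept and the amplitude boost that a negative B-factor implies at a given resolution. It assumes the common convention in which Fourier amplitudes at resolution d are multiplied by exp(−B/(4d²)); the paper does not state the convention used, and the function names are hypothetical.

```python
import math


def retained_fraction(kept, total):
    """Fraction of the data set used in the final reconstruction."""
    return kept / total


def sharpening_weight(resolution_A, b_factor_A2):
    """Amplitude scale factor exp(-B / (4 d^2)); a negative B boosts fine detail."""
    return math.exp(-b_factor_A2 / (4.0 * resolution_A ** 2))


print(f"particles kept: {retained_fraction(47968, 68526):.1%}")
for d in (10.0, 5.0, 3.3):
    print(f"d = {d:>4} A: amplitudes scaled by {sharpening_weight(d, -120.0):.2f}")
```

Under this assumed convention, a B-factor of −120 Å² leaves low-resolution terms almost unchanged while amplifying terms near 3.3 Å by more than an order of magnitude, which is why the choice has to be balanced against noise.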

Since there were no densities in the initial montage model at the VP4 locations, 
the emergence of VP4 densities in the map and the match of side-chain densities to 
those expected from the VP4 amino acid sequence (Fig. 2) provide strong internal 
controls for the validity of the high resolution cryoEM map. Consistent with this 
assessment, the locations of the RdRP in the final reconstruction are not only dif- 
ferent from those in the initial montage model, but also are related by D3 symmetry 
instead of the tetrahedral symmetry in the initial model. Most convincingly, the 
density features in the final map agree with the CPV RdRP amino acid sequence 
but differ from that of the MRV RdRP used in the initial model. 

In addition, we also performed independent reconstruction without using the 
model of the 12 MRV RdRPs, and obtained a nearly identical structure from the 




same data set. In this procedure, we first determined an icosahedral reconstruc- 
tion without using any initial models. This icosahedral reconstruction was used 
to restrain refinement without symmetry (that is, symmetry operator is C1) to 
search for orientation around the 60 icosahedral-symmetry-related locations with 
Relion**. This independent result further validates our TEC structures. 

To obtain the 3D structure of the transcribing particles, we low-pass filtered the
above 3D map of q-CPV to 30 Å resolution and used it as the initial model. After
11 iterations of asymmetrical global search and 2 iterations of local refinement,
the density map converged to a resolution of 4.8 Å, and the density quality of the
TEC was further improved to ~4.0 Å resolution by aligning and averaging all TEC
densities inside the asymmetric reconstruction (Extended Data Fig. 5d, f, g).
Asymmetric reconstruction using capsid-subtracted images. To improve the 
genome structure further, we used the following procedure to carry out asymmetric 
reconstruction of q-CPV with the same particle image data set but with capsid 
contribution subtracted. As illustrated in Extended Data Fig. 1, this procedure 
includes four stages: 1, capsid subtraction in raw particle (orange); 2, initial model 
generation (green); 3, asymmetric feature emergence in Relion™ refinement (blue); 
4, orientation selection (purple). 

In the first stage (orange in Extended Data Fig. 1), we determined the orientation
and centre parameters for each particle and obtained an icosahedral reconstruction
with Frealign®* from raw particles with an inverse B-factor of −40 Å²
(Extended Data Fig. 1a, b). On the basis of these parameters, a CTF-corrected
projection (Extended Data Fig. 1c) with empirical B-factor of 160 Å² was generated.
Next, the capsid contribution to the images was removed by subtracting the 2D 
projection corresponding to the icosahedral orientation of each image as done 
before?’ with the following improvements. To subtract the contribution from 
the capsid accurately, we determined a scaling factor between capsid projection 
(Extended Data Fig. 1c) and each raw particle image (Extended Data Fig. 1a). 
The projection and raw images were both band-pass filtered between 1/400 Å⁻¹
and 1/29 Å⁻¹, then radially masked based on the inner and outer diameters of the capsid
to produce ring-shaped projections (Extended Data Fig. 1d) and raw (Extended
Data Fig. 1e) images. The standard deviations of these ring-shaped images were
calculated and used to normalize both the unmasked and masked (that is, ring-shaped)
projections. The cross-correlation coefficient (0 to 1) between the ring-shaped
raw image and the normalized ring-shaped projection was computed and
used as the probability factor measuring the contribution of capsid signal in the 
raw particle image. Each raw image was then subtracted by the unmasked projec- 
tion multiplied by this probability factor to generate a capsid-subtracted particle 
(Extended Data Fig. 1f) for the following refinement. Particles with a probability 
factor less than 0.1 were not included in the subsequent analyses. 
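The scaling and subtraction step described above can be sketched as follows. This is a simplified illustration, not the authors' program: images are flattened to plain lists, the band-pass filtering is omitted, and the normalization is assumed to mean rescaling the projection so that its standard deviation inside the ring matches that of the raw image. The function and parameter names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def subtract_capsid(raw, projection, ring_mask, min_factor=0.1):
    """Return (capsid-subtracted image or None, probability factor).

    raw, projection: flattened pixel lists of equal length.
    ring_mask: booleans marking pixels inside the capsid ring.
    """
    ring_raw = [r for r, m in zip(raw, ring_mask) if m]
    ring_proj = [p for p, m in zip(projection, ring_mask) if m]

    # Normalize the projection so its ring-region spread matches the raw image.
    scale = _std(ring_raw) / _std(ring_proj)
    proj_scaled = [p * scale for p in projection]
    ring_scaled = [p * scale for p in ring_proj]

    # Cross-correlation coefficient within the ring, used as the probability factor.
    mr, mp = _mean(ring_raw), _mean(ring_scaled)
    cov = sum((r - mr) * (p - mp) for r, p in zip(ring_raw, ring_scaled)) / len(ring_raw)
    factor = cov / (_std(ring_raw) * _std(ring_scaled))

    if factor < min_factor:          # particle excluded from further analysis
        return None, factor
    subtracted = [r - factor * p for r, p in zip(raw, proj_scaled)]
    return subtracted, factor
```

In this sketch a raw image that is a pure scaled copy of the projection yields a factor of 1 and a residual of zero, whereas a particle whose ring region correlates poorly with the capsid projection is dropped, mirroring the 0.1 cut-off in the text.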

In the second stage (green in Extended Data Fig. 1), the map from the above
Frealign asymmetric refinement (Extended Data Fig. 1g) was low-pass filtered to
60 Å resolution, masked with a 260 Å radius (Extended Data Fig. 1h), and used
to refine the capsid-subtracted particles (Extended Data Fig. 1f) with Relion version
1.2. The Tau2_fudge value (T-factor) in Relion was set to 0.5. The T-factor is an
ad hoc value in Relion to tune refinement speed, and a value of 0.5 slowed down
the refinement progression, thus ensuring the priority use of low-resolution (up
to 20 Å, such as dsRNA) data in the refinement. This refinement led to a reconstruction
without the capsid (Extended Data Fig. 1i). This capsid-removed map
has 12 TECs with D3 symmetry, which could be classified into two groups: the
first group containing six better-resolved TECs close to the three-fold axis (polar),
and the other group containing six less-resolved TECs near the equator (tropical),
suggesting potential smearing of density due to orientation mis-assignments or TEC
flexibility/lower occupancy near the equator.

To eliminate potential orientation mis-assignments further, we next conducted
the third stage of data processing (blue in Extended Data Fig. 1). We first low-pass
filtered the capsid-removed reference (Extended Data Fig. 1i) to 32 Å resolution
(Extended Data Fig. 1j) and used it to drive Relion refinement with the capsid-subtracted
particles (Extended Data Fig. 1f). The T-factor used in this refinement is
0.1, only 2.5% of that used in Relion convention, thus ensuring slow progression
of the refinement. Slower refinement provides time for asymmetrical features to
emerge. Relion global search was carried out with a 3.75° angular interval,
followed by local angular search with a 1.875° interval and highly constrained
translational search (0.7 pixel in range with 0.5 pixel interval). Asymmetrical RNA
density features with ten TECs emerged after ten iterations (Extended Data Fig. 1k).
In our procedure, one way to prevent trapping in local minima in orientation
assignment due to symmetric structural elements is to filter the current refinement
result back to ~32 Å resolution and refine with a T-factor of 0.1 again to remove
residual symmetric features from the working reference. This process is carried
out iteratively.

To improve resolution of the 3D map further, we carried out the fourth stage
for particle orientation selection (purple in Extended Data Fig. 1). From the
orientation of each particle determined in the high-resolution (~3 Å) icosahedral
reconstruction (Extended Data Fig. 1b), we calculated 60 icosahedral-related
orientation candidates. The task of the rest of the fourth stage of data processing is to
select one out of these 60 orientation candidates to be the asymmetric orientation 
of the particle as done before****. To do this, we continued to run Relion refine- 
ment for 15 iterations using the above asymmetric map with 10 TECs (Extended 
Data Fig. 1k) as initial model and the orientation determined by each iteration was 
recorded, giving rise to 15 Relion orientations for each particle. For each of these 15 
Relion orientations, we calculated its angular distances to the 60 icosahedral-related 
orientation candidates, and the icosahedral-related orientation candidate with the 
smallest angular distance was selected as the working orientation for that iteration, 
resulting in a total of 15 working orientations for each particle. A particle was
retained if 14 or all of its 15 working orientations were the same (that is, the
selected orientation) and their averaged angular distance was less than 3°;
otherwise, the particle was discarded. This procedure yielded a total of 11,741
particles with selected orientations. The original raw images of these selected
particles were combined to generate an asymmetric reconstruction using Frealign, and
the resolution was determined to be 5.1 Å.
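
The selection rule in this fourth stage can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names are hypothetical, orientations are represented as plain 3 × 3 rotation matrices, the 60 candidates in the test data are toy rotations about a single axis rather than true icosahedral equivalents, and the averaged angular distance is assumed to be taken over the iterations that agree on the selected candidate.

```python
import math
from collections import Counter


def rot_z(deg):
    """Toy orientation: 3x3 rotation about the z axis by `deg` degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def angular_distance_deg(a, b):
    """Rotation angle (degrees) of the relative rotation a^T b."""
    # trace(a^T b) equals the element-wise product of a and b, summed.
    trace = sum(a[k][i] * b[k][i] for k in range(3) for i in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def working_orientation(refined, candidates):
    """Index of, and distance to, the candidate closest to one refined orientation."""
    dists = [angular_distance_deg(refined, c) for c in candidates]
    best = min(range(len(candidates)), key=dists.__getitem__)
    return best, dists[best]


def select_orientation(refined_per_iter, candidates, min_agree=14, max_mean_deg=3.0):
    """Return the selected candidate index, or None if the particle is discarded."""
    picks = [working_orientation(r, candidates) for r in refined_per_iter]
    winner, votes = Counter(i for i, _ in picks).most_common(1)[0]
    if votes < min_agree:
        return None  # fewer than 14 of 15 iterations agree
    # Assumption: average only over the iterations that chose the winner.
    agreeing = [d for i, d in picks if i == winner]
    mean_dist = sum(agreeing) / len(agreeing)
    return winner if mean_dist < max_mean_deg else None
```

With 15 recorded orientations per particle, `select_orientation` returns a candidate index only when at least 14 iterations point to the same candidate and stay within 3° of it on average, mirroring the retention criterion described above.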

As shown in Extended Data Fig. 2, this procedure was repeated by using a
Gaussian ball to replace the capsid + TEC model (Extended Data Fig. 1g) in the
initial model generation stage (green in Extended Data Fig. 1). The result was the same,
confirming that our procedure was not influenced by the choice of initial model.

Atomic modelling and visualization. The atomic models of both RdRP and VP4
in the quiescent state were built with Coot and refined with Phenix (ref. 38), as described
previously”. 

The atomic model of the VP4 structure was manually built with Coot. Because
no homology models of VP4 previously existed, the Cα backbone was
constructed by matching the VP4 amino acid sequence to the density map. Once
the correct placement of each residue was ensured, the backbone was converted
to a purely alanine backbone by the function ‘Mainchain’, and mutated to the
corresponding amino acids through the function ‘Mutate Residue Range’. With the
initial model now completed, the ‘Density Fit Analysis’ validation tool was used
to screen for segments of the model that did not fit the density. When identified,
these segments and the amino acids surrounding them were examined for any
other possible conformations that would better fit the density. Owing to the high
resolution of this structure, this was completed through the refinement tool ‘Real
Space Refine Zone’, which optimizes the fit of the model to the density while
preserving stereochemistry. Refinement was also performed based
on the Ramachandran plot, an important indicator of three-dimensional protein
structure that validates the torsion angles of a protein chain. In the Ramachandran
plot, any residues with disallowed values were selected, and the stereochemistry
of each such residue along with its surrounding residues was optimized with the
refinement tool ‘Regularize Zone’. After ideal Ramachandran values were obtained (<1%
outliers), the refinement function ‘Rotamers’ was used to select a rotamer that
best fit the density.

The atomic model of the polymerase structure was also manually built with
Coot. However, since an atomic model for the MRV polymerase was available in
the Protein Data Bank (accession number 1MUK), this model was used as a
template to assist with model building through the identification of the N terminus,
C terminus and various secondary structures. Once the Cα backbone was
built by matching the polymerase amino acid sequence to the density map and
mutated to the appropriate amino acids, the model was refined with ‘Regularize
Zone’, ‘Rotamers’ and ‘Real Space Refine Zone’. The model was validated with
the Ramachandran plot and the function ‘Density Fit Analysis’. The complex
of VP4 and polymerase was then refined with Phenix, including the real space
refinement (ref. 38).

The atomic models of the transcribing state were built by fitting the atomic
structures of RdRP and VP4 in the quiescent state into the density, manually adjusting
the changed residues with Coot, and refining the models with Phenix (ref. 38).

Visualization, segmentation of density maps, and generation of videos were
done with UCSF Chimera (ref. 31).


31. Pettersen, E. F. et al. UCSF Chimera-a visualization system for exploratory 
research and analysis. J. Comput. Chem. 25, 1605-1612 (2004). 

32. Mindell, J. A. & Grigorieff, N. Accurate determination of local defocus and 
specimen tilt in electron microscopy. J. Struct. Biol. 142, 334-347 (2003). 

33. Lyumkis, D., Brilot, A. F., Theobald, D. L. & Grigorieff, N. Likelihood-based
classification of cryo-EM images using FREALIGN. J. Struct. Biol. 183, 377-388
(2013).

34. Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM
structure determination. J. Struct. Biol. 180, 519-530 (2012).

35. Yu, X., Ge, P., Jiang, J., Atanasov, I. & Zhou, Z. H. Atomic model of CPV reveals
the mechanism used by this single-shelled virus to economically carry out 
functions conserved in multishelled reoviruses. Structure 19, 652-661 
(2011). 

36. Zhang, X. et al. Near-atomic resolution using electron cryomicroscopy and 
single-particle reconstruction. Proc. Natl Acad. Sci. USA 105, 1867-1872 
(2008). 

37. Wolf, M., Garcea, R. L., Grigorieff, N. & Harrison, S. C. Subunit interactions in 
bovine papillomavirus. Proc. Natl Acad. Sci. USA 107, 6298-6303 (2010). 

38. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for 
macromolecular structure solution. Acta Crystallogr. D 66, 213-221 (2010). 

39. Huiskonen, J. T., Jäälinoja, H. T., Briggs, J. A., Fuller, S. D. & Butcher, S. J.
Structure of a hexameric RNA packaging motor in a viral polymerase complex. 
J. Struct. Biol. 158, 156-164 (2007). 

40. Briggs, J. A. et al. Classification and three-dimensional reconstruction of 
unevenly distributed or symmetry mismatched features of icosahedral 
particles. J. Struct. Biol. 150, 332-339 (2005). 

41. Booy, F. et al. Liquid-crystalline, phage-like packing of encapsidated DNA in 
herpes simplex virus. Cell 64, 1007-1015 (1991). 

42. Zhang, Y., Kostyuchenko, V. A. & Rossmann, M. G. Structural analysis of viral 
nucleocapsids by subtraction of partial projections. J. Struct. Biol. 157, 
356-364 (2007). 

43. Tao, Y. et al. Assembly of a tailed bacterial virus and its genome release studied 
in three dimensions. Cell 95, 431-437 (1998).

44. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. 
Acta Crystallogr. D 60, 2126-2132 (2004). 

45. Zhang, X. et al. A new topology of the HK97-like fold revealed in Bordetella 
bacteriophage by cryoEM at 3.5 Å resolution. eLife 2, e01299 (2013).




[Figure: flowchart of the asymmetric reconstruction procedure, with stages labelled initial model generation, capsid subtraction, asymmetrical feature emergence and orientation selection. Recoverable annotations: low-pass filtering to 60 Å; T-factor set to 0.1 in RELION; low-pass filtering back to 32 Å and refining again if symmetrical features remain after several iterations; selection based on angular distance to 60 candidates among iterations; (m) 5.1 Å map of virion.]

Extended Data Figure 1 | Illustration of the asymmetric reconstruction procedure using particles with the capsid density subtracted. See Methods for
full explanation of panels a-m.




[Figure: three successive refinement runs (35, 20 and 31 iterations); band-pass filter (100 Å to 30 Å).]

Extended Data Figure 2 | Validation of asymmetric reconstruction from capsid-subtracted images using a Gaussian ball as the initial model. Arrows
linking a to f represent the progression of the procedure. The top panels (a, c, e) show the input model for each run and the bottom panels (b, d, f) show the
output of each run.




Extended Data Figure 3 | Sections of the q-CPV density map along the three-fold (that is, the earth axis) (a) and two-fold (b) axes of the pseudo-D3 
symmetry. Note the lack of three-fold and two-fold symmetry in the RNA density in contrast to the perfect symmetry of the capsid shell proteins. Pixel 
size = 4.04 Å; clipped map size = 166 × 166 × 120 pixels.




[Figure: views of the TEC and RNA densities, with the north and south poles labelled.]

Extended Data Figure 4 | dsRNA density maps in the quiescent state.
a, View of TEC + RNA densities in the same orientation as Fig. 1d.
b, c, The same view as in a but rotated by +90° (b) or −90° (c) about the x axis
in panel a to view from either the north (b) or south (c) pole. d-f, Three views
from three two-fold axes on the equator, each rotated by 120° about the
y axis from the others. g, dsRNA density maps at the twelve vertices. TECs
are arranged and numbered according to Fig. 1d. First row, TECs 1, 2, 3;
second row, TECs 4, 5, 6; third row, TEC 7 and two unoccupied positions;
fourth row, TECs 8, 9, 10. All TECs have a dsRNA segment bonded at the
flange, each marked with a black arrow. Unlike the polar TECs, each tropical
TEC (4-7) is surrounded by an extra density rod (open arrow).




[Panels c, d: plots of Fourier shell correlation against spatial frequency for the quiescent and transcribing states, with the 0.143 threshold marked.]

g                               Quiescent state                        Transcribing state
Sample buffer (pH 8.0)          70 mM Tris-Cl pH 8.0, 10 mM MgCl2,     70 mM Tris pH 8.0, 10 mM MgCl2,
                                100 mM NaCl and 2 mM GTP               100 mM NaCl, and 1 mM SAM + 2 mM GTP +
                                                                       2 mM UTP + 2 mM CTP + 4 mM ATP
Microscope                      FEI Titan Krios                        FEI Titan Krios
Voltage (kV)                    300                                    300
Camera                          Gatan K2 (counting mode)               Gatan K2 (counting mode)
Energy filter                   None                                   GIF Quantum Energy Filter
Nominal magnification           49,000×                                105,000×
Å/pixel                         1.01                                   1.36
Underfocus range (µm)           0.6-4.51                               0.89-3.4
Total dose (electrons/Å²)       ~25                                    ~40
Frames per micrograph           12                                     40
Micrographs exposed             4385                                   4907
Particles selected              68526                                  81887
Resolution (Å), FSC ≥ 0.143     3.3 (averaged)                         4.0 (averaged)

Model refinement with Phenix
Rwork (overall)                 0.18 (50-3.3 Å)                        0.25 (50-4.0 Å)
Rfree (overall)                 0.17 (50-3.3 Å)                        0.23 (50-4.0 Å)
Rwork (best resolution zone)    0.37 (3.6-3.3 Å)                       0.35 (4.6-4.0 Å)
Rfree (best resolution zone)    0.36 (3.6-3.3 Å)                       0.32 (4.6-4.0 Å)

Ramachandran plot values
Most favored                    97.57%                                 95.91%
Generously allowed              2.38%                                  3.91%
Disallowed regions              0.06%                                  0.18%

Extended Data Figure 5 | CryoEM reconstructions of CPV in the
quiescent and transcribing states. a, b, CryoEM images of CPV particles
in quiescent (a) and transcribing (b) states. These images were obtained
by aligning and averaging frames in direct electron counting image stacks.
Fibre-like nascent mRNAs are visible over background in b (marked
by green arrows), while the background in a is clean. c, d, Fourier shell
correlation coefficients (FSCs) as a function of spatial frequency between
two half maps for reconstructions in the quiescent (c) and transcribing
(d) states. The black and red lines represent FSCs for the asymmetrical
reconstructions of capsid + genome and the locally averaged TEC densities,
respectively. The effective resolutions of the locally averaged maps are ~3.3 Å
(c) and ~4.0 Å (d) (FSC>0.143) for maps in the quiescent
and transcribing states, respectively. e, f, CryoEM densities (grey surface
representations) superimposed with atomic models (ribbons and sticks)
for the quiescent (e) and transcribing (f) states. The α-helix (Pα12) and the
four-stranded β-sheet (Pβ4, Pβ7-8 and Pβ11) in e and f are both from the palm
subdomain of the polymerase domain at 3.3 Å (e) and ~4.0 Å (f) resolutions.
g, Statistics of CPV reconstructions and atomic model refinement.




Naw Nes Newt N1 N2 NaS Na6é 


1 10 20 30 40 50 60 70 80 90 100 
MLPNTELYNTIFSETRKFTRESFKE!EHLTAKLANDRVARHDFLFENNS|!ALISDYSGEDSNGNQLQATVTIPNEI TNPKEYDPSDYPLAEDESF FKQGHK 


N3 N4 


110 120 130 140 150 160 170 180 190 200 
YDYLVTFRAGSLTNTYEPKTKMYKLHAALDKLMHVKQRKSRFADLWRELCAV!IASLDVWYQTTNYPLRTYVKLLFHKGDEFPFYESPSQDKI!1FNDKSVA 
Na10 Na11 Na12 Na13 N N6 Na14 


210 220 230 240 250 260 270 280 290 300 
SILPTFVYTCCQVGTAIMSGILTHVES 1 VAMNHFLHCAKDSY I DEKLKIKGIGRSWYQEALHNVGRATVPVWSQFNEVIGHRTKTTSEPHFVSSTFISLR 
Nat5 Nai6 Pat P14 Pa2 


310 320 330 340 350 360 370 380 390 400 
AKRAELLYPEFNEY I NRALRLSKTQNDVANY YAACRAMTNDGTFLATLTELSLDAAVFPRIEQRLVTRPAVLMSNTRHESLKQKYANGVGS | AQSYLSSF 
Pa3 Pa4 Pad P P3. Pa6 


421 430 440 450 460 470 480 490 500 
TDE!AKRVNGIHHDEAWLNFLTTSSPGRKLTE!lEKLEVGGDVAAWSNSR I! VMQGAVFAREYRTPERI! FKSLKAP|IKLVERQQSDRRQRAISGLDNDRLFLS 


Pa7 Pa8 P4 PaQ Pa10 P5 P6 


510 520 530 540 550 560 570 580 590 600 
FMPYT1GKQI YDLNDNAAQGKQAGNAFD1GEMLYWTSQRNVLLSS | DVAGMDASVTTNTKDI YNTFVLDVASKCTVPRFGPY YAKNMEVFEVGKRQSQVK 


Pai1 Pai2 P7 P8& Pat3 


610 620 630 640 650 660 670 680 690 700 
YVNAAWGACALEAANSQTSTSYESE I FGQVKNAEGTYPSGRADTSTHHTVLLQGLVRGNELKRASDGKNSCLTT1IKILGDDIMEl FQGNENDTHDHAVSN 


P9. P10 P11 P12 Pa14 Pa15 P13 


710 720 730 740 750 760 770 780 790 800 
AS | LNESGFATTAELSQNS | VLLQQLVVNGTFWGFADRISLWTREDTKDIGRLNLAMMELNAL | DDLLFRVRRPEGLKMLGFFCGAICLRRFTLSVDNKL 


Pa16 P1 P15 Pa18 Bat Ba2 


810 820 830 840 850 860 870 880 890 900 
YDSTYNNLSKYMTLVKYDKNPDFDSTLMSLILPLAWLFMPRGGEYPAYPFERRDGTFTEDESMFTARGAYKRRLLYDVSN!I REMI QQNSMVLDDDLLHEY 


Ba4 Bad Ba6é Ba7 Bos 


910 920 930 940 950 960 970 980 990 1000 

GFTGALLLIDLNILDLIDEVKKEDISPVKVNELATSLEQLGKLGEREKSRRAASDLK I RGHALSNDI VYGYGLQEK | QKSAMATKETTVQSKRVSSRLHE 
Bad Bat0 B1 Bat1 Bai2 Bai3 Bat4 

1010 1020 1030 1040 1050 1060 1070 1080 1090 1100 
VIVAKTRDYKIPTMPADALHLYEFEVEDVTVDLLPHAKHTSYSNLAYNMSFGSDGWFAFALLGGLDRSANLLRLDVAS I RGNYHKFSYDDPVFKQGYKIY 

15 Bat Bat7 Bat8 Ba19 

1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 
KSDATLLNDFFVAISAGPKEQGILLRAFAYYSLYGNVEYHYVLSPRQLFFLSDNPVSAERLVRIPPSYYVSTQCRALYNI FSYLHILRS! TSNQGKRLGM 

Ba2 Ba21 
1210 1220 


VLHPGLIAYVRGTSQGAILPEADNV 


Extended Data Figure 6 | Sequence and secondary structure assignment of CPV RdRP in the quiescent state. α-helices were marked by cylinders,
β-strands by arrows, loops by thin lines, and the flexible tip domain by dashed lines. The colour scheme is the same as Fig. 3a.




Extended Data Figure 7 | The RdRP-bound dsRNA in the quiescent and
transcribing states. a-c, Location of a TEC on the inner surface of the
capsid shell in the quiescent and transcribing states. The inner surface of
the CPV capsid (a) with 10 CSPs labelled (CSP A.1/B.1 to CSP A.5/B.5).
b, c, Position of a TEC on the inner surface of capsid in the quiescent (b)
and transcribing (c) states. VP4 and RdRP are coloured cyan and purple,
respectively. An icosahedral five-fold axis is indicated with a small green
pentagon. d, e, CryoEM densities of TEC and dsRNA (orange) in the
quiescent (d) and transcribing (e) states. f, g, Models of TEC (surface
representation) and dsRNA (ribbons) in the quiescent (f) and transcribing
(g) states. Close-up views show the bound dsRNA (surface representation)
on RdRP in the quiescent state (f) and its detachment in the transcribing
state (g). VP4 is coloured cyan and the RdRP is coloured as in Fig. 3a. All
surfaces displayed in this figure were rendered from models, except for the
density maps of RdRP + dsRNA in d, e.




Extended Data Figure 8 | Tracing amino acid residues 910-932 and
971-1000 of module B of the bracelet domain of RdRP in the quiescent
and transcribing states. a, b, CryoEM densities of RdRP in the quiescent
(a) and transcribing (b) states. The locations of the residues 910-932 and
971-1000 are indicated with cyan boxes in a and b. Owing to their flexibility,
these residues are not readily visible when displayed as in a and b but
become visible when the maps are filtered to a lower resolution (for example,
4.5 Å resolution) as in c-f. The colour scheme of domains/subdomains is the
same as in Fig. 3a. c, Trace of the residues 971-1000 (green) and 910-932
(purple) of module B of the bracelet domain of RdRP in the quiescent state.
d, The same as c but in a different view. e, Trace of the residues 971-1000
(green) and 910-932 (purple) of module B of the bracelet domain of RdRP
in the transcribing state. f, The same as e but in a different view to show the
unambiguous trace of the two peptide fragments. g, h, Trace of the residues
910-923 (g) (purple) and 926-932 (h) (purple) of the bracelet domain of
RdRP in the transcribing state, showing the unambiguous trace of the two
peptide fragments. i, j, CryoEM densities (grey) and model (ribbon) of
RdRP in the transcribing state, showing α-helices (i) and a β-hairpin (j). The
colour scheme of domains/subdomains is the same as in Fig. 3a. k, l, Trace
of the residues of the bracelet domain of RdRP in the transcribing state,
showing an α-helix (k) and a β-sheet (l).




Extended Data Figure 9 | Stereo and rotated views of Fig. 4a, b. a, b, Stereo views of modules A (yellow cylinders and loops) and B (purple cylinders and 
loops) of the bracelet domain of RdRP in the quiescent (a) and transcribing (b) states. c, d, Same as in a, b, but rotated around the x axis by 90°. All surfaces 
displayed in this figure were rendered from models. 




Extended Data Figure 10 | Comparisons of RdRPs from CPV and MRV.
a, b, CryoEM in situ structure of the RdRP in t-CPV (a) and crystal structure
of the MRV RdRP (b), both containing an RNA duplex in the active site. c-e,
Superposition of domains of RdRPs from t-CPV (colour) and MRV (grey):
N-terminal (c), polymerase (d) and bracelet (e) domains. f-h, Comparisons
of modules A (yellow) and B (magenta) of the bracelet domain of RdRPs
from q-CPV (f), t-CPV (g) and MRV (h). i-k, The same as in f-h, but with
helices shown as cylinders, as in Fig. 4a, b.




LETTER 


doi:10.1038/nature15760 


Foreign DNA capture during CRISPR-Cas adaptive immunity


James K. Nuñez1*, Lucas B. Harrington1*, Philip J. Kranzusch1,2, Alan N. Engelman3,4 & Jennifer A. Doudna1,2,5,6,7,8


Bacteria and archaea generate adaptive immunity against phages
and plasmids by integrating foreign DNA of specific 30-40-base-pair
lengths into clustered regularly interspaced short palindromic
repeat (CRISPR) loci as spacer segments. The universally
conserved Cas1-Cas2 integrase complex catalyses spacer
acquisition using a direct nucleophilic integration mechanism
similar to that of retroviral integrases and transposases. How the
Cas1-Cas2 complex selects foreign DNA substrates for integration
remains unknown. Here we present X-ray crystal structures of the
Escherichia coli Cas1-Cas2 complex bound to cognate 33-nucleotide
protospacer DNA substrates. The protein complex creates a curved
binding surface spanning the length of the DNA and splays the
ends of the protospacer to allow each terminal nucleophilic 3’-OH
to enter a channel leading into the Cas1 active sites. Phosphodiester
backbone interactions between the protospacer and the proteins
explain the sequence-nonspecific substrate selection observed
in vivo. Our results uncover the structural basis for foreign DNA
capture and the mechanism by which Cas1-Cas2 functions as a
molecular ruler to dictate the sequence architecture of CRISPR loci.

CRISPR loci are defined by repetitive elements that are separated by
similarly sized spacer sequences acquired from foreign DNA during
the adaptation stage of CRISPR-Cas adaptive immunity. CRISPR
transcripts generated from the loci assemble with Cas proteins to detect
and cleave foreign nucleic acids bearing sequence complementarity
to the spacer segment. In E. coli, expression of the Cas1-Cas2
protein complex triggers acquisition of new 33-base-pair (bp) spacers at
the A/T-rich leader end of the CRISPR locus. How the Cas1-Cas2
complex selects 33-bp protospacers of variable sequences and activates
the 3’-OH ends for integration remains unknown. As the Cas1-Cas2
complex is sufficient to initiate spacer acquisition and adaptation of
the CRISPR-Cas immune system, we hypothesized that the protein
complex alone must provide the structural basis for the unknown
mechanism of spacer length determination.

To determine how protospacer variation influences the efficiency of
Cas1-Cas2-mediated spacer acquisition, we used an in vitro integration
assay to test versions of a 33 bp sequence with constant overall length
but different 3’ single-stranded overhang lengths. The protospacer
sequence is derived from the M13 bacteriophage genome and is highly
acquired into the E. coli CRISPR locus after infection. Unexpectedly,
protospacers with overhanging 3’ nucleotides are strongly preferred by
the Cas1-Cas2 complex over a completely double-stranded 33 bp
protospacer (Fig. 1a and Extended Data Fig. 1a, b). Single-stranded DNA
and substrates with 5’ overhangs are poor substrates for integration,
highlighting the ability of Cas1-Cas2 to select specific DNA substrates
before integration. The most preferred protospacer DNA for in vitro
integration consists of five overhanging nucleotides on each 3’ end
(Extended Data Fig. 1). To determine the molecular basis of Cas1-Cas2
protospacer capture, we assembled Cas1-Cas2 complexes with the
preferred protospacer substrate and determined crystal structures of the
complex in the presence and absence of Mg2+ at 3.0 Å and 3.2 Å
resolution, respectively (Extended Data Fig. 2 and Extended Data Table 1).
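
The geometry of these substrates can be worked through in a short Python sketch. It is an illustration only: the function name is hypothetical, and it assumes that the overhangs are the same length at both 3’ ends and that 'constant overall length' means the end-to-end span stays at 33 nucleotides, so that lengthening the overhangs shortens the duplex.

```python
def substrate_layout(total_nt=33, overhang_nt=5):
    """Duplex length and per-strand length for a protospacer with equal 3' overhangs.

    Assumes the end-to-end span is fixed at `total_nt` (hypothetical helper,
    not taken from the original work).
    """
    duplex_bp = total_nt - 2 * overhang_nt
    if overhang_nt < 0 or duplex_bp <= 0:
        raise ValueError("overhangs must be non-negative and leave a duplex region")
    # Each strand contributes the duplex region plus its own 3' overhang.
    return {"duplex_bp": duplex_bp, "strand_nt": duplex_bp + overhang_nt}


if __name__ == "__main__":
    for n in range(10):  # overhang lengths 0 to 9 nt
        print(n, substrate_layout(overhang_nt=n))
```

Under these assumptions, the preferred five-nucleotide overhangs leave a 23 bp duplex, which matches the 23 bp duplex span labelled in Fig. 1.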

The structures reveal a hexameric protein architecture comprising
four copies of Cas1 and two copies of Cas2, in which the protospacer
spans the central Cas2 dimer and terminates within individual Cas1
subunits on each end of the complex (Fig. 1b). Structural superposition
of the Cas1-Cas2 complex with and without bound DNA reveals a
DNA-induced change in Cas1 subunit orientation in which each Cas1
dimer rotates ~10° in opposing directions against the central Cas2 hub
(Extended Data Fig. 3a, b). Cas1-Cas2 protospacer capture positions
each single-stranded protospacer 3’ end within a channel leading
directly to a Cas1 active site. Simulated annealing omit maps show clear
electron density for the double-helical region and the five-nucleotide
overhangs on each end of the protospacer (Extended Data Fig. 4a-c).
The constrained protein channel guiding each DNA strand from its


[Figure 1 panels: a, agarose gel of integration reactions against 3’ overhang length (0-9 nt), with integration product, S.C. and Band X bands marked; Integration (%): 2 4 29 44 43 55 66 69 341113 7. b, overall architecture with a 23 bp duplex span marked. c, active sites 1 and 2 with the 3’-OH indicated.]


Figure 1 | Overall architecture and active site positioning of 3’-OH 
nucleophile. a, A representative agarose gel of in vitro integration reactions 
using increasing lengths of 3’ single-strand (ss) protospacer DNA overhangs. 
Per cent integration values are the average of three independent experiments. 
kb, kilobases; nt, nucleotide; S.C., supercoiled pCRISPR; Band X, relaxed 
pCRISPR byproduct (ref. 12). b, The overall architecture of Cas1-Cas2 bound 
to protospacer DNA. The line segments indicate the length of the DNA, 
spanning a total of 33 nucleotides. c, Stick configurations of the two Cas1 
active sites (blue subunits in b) that coordinate the nucleophilic 3’-OH ends 
of the protospacer (green arrow). Supplementary Information contains the 
full image for a. 


1Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California 94720, USA. 2Howard Hughes Medical Institute, University of California, Berkeley, Berkeley,
California 94720, USA. 3Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA. 4Department of Medicine, Harvard Medical School,
Boston, Massachusetts 02115, USA. 5Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA. 6Physical Biosciences Division, Lawrence Berkeley National
Laboratory, Berkeley, California 94720, USA. 7Innovative Genomics Initiative, University of California, Berkeley, Berkeley, California 94720, USA. 8Center for RNA Systems Biology, University of
California, Berkeley, Berkeley, California 94720, USA.
*These authors contributed equally to this work.


26 NOVEMBER 2015 | VOL 527 | NATURE | 535 


© 2015 Macmillan Publishers Limited. All rights reserved 




Figure 2 | Coordination of protospacer DNA within the complex.
a, Electrostatic potential surface representation of the Cas1-Cas2 complex
with the protospacer shown in yellow. b, Close up of the arginine channel
that stabilizes the ssDNA overhang. c, Stick configuration representation
of arginine clamp residues that coordinate the protospacer duplex
region. d, Map of amino acid residues that coordinate the protospacer
phosphodiester backbone (black dots). Residue colours indicate
Cas1-Cas2 protomers from Fig. 1b. e, Agarose gels of in vivo spacer
acquisition assays of arginine channel and clamp mutant proteins.


double-helical region to the single-strand-accommodating Cas1 active
site explains the specificity of Cas1-Cas2 for five-nucleotide 3’
overhang substrates (Fig. 1a and Extended Data Fig. 1). Two of the four
Cas1 subunits, coloured green in Fig. 1b, are not occupied with the
protospacer 3’ ends and are probably non-catalytic, since the 3’-OH
nucleophile and the scissile phosphodiester bond of the target DNA
must be in the same active site for direct nucleophilic integration.

In the active sites, the 3’ terminal base is involved in a stacking
interaction with Y217 that positions the nucleophilic 3’-OH ends of the
protospacer near the conserved metal-binding residues E141, H208
and D221 (Fig. 1c). Although we cannot assign density for Mg2+ in
the active sites, these three residues have been shown previously to
coordinate a Mn2+ ion in the active site of Cas1 from Pseudomonas
aeruginosa. Furthermore, alanine mutations at these positions
disrupt in vivo spacer acquisition. Thus, the observed positioning of
the 3’-OH nucleophiles and catalytic residues probably represents the
active configuration of the nucleoprotein complex immediately before
spacer integration.

All interactions between Cas1-Cas2 and protospacer DNA involve
coordination of the phosphate backbone rather than base-specific
contacts, consistent with the variable sequence selection of protospacers
that is essential for resistance to diverse foreign sequences. Two
central regions of the Cas1-Cas2 complex, which we term the ‘arginine
clamp’ and the ‘arginine channel’, stabilize the protospacer (Fig. 2a-d).
The arginine clamp interacts with the middle of the duplex region
where four Arg residues coordinate each DNA strand: Cas1 R41 and
Cas2 R16, R77 and R78 (Fig. 2c). Reverse charge mutations of Cas1 R41
and Cas2 R16 and R78 drastically reduce spacer acquisition in vivo,
whereas the Cas2 R77E mutant functions similarly to wild-type Cas2
(Fig. 2e). Thus, Cas1 R41, Cas2 R16 and R78 are the key constituents of
the arginine clamp. The contribution of Cas2 to protospacer DNA
binding supports the previous hypothesis that the main function of Cas2
is to form a non-catalytic scaffold within the Cas1-Cas2 complex.

Cas1 residues R66, R84, R245 and R248 line the arginine channel
that stabilizes the junction where the duplex region terminates and the
single-stranded DNA overhang enters the active site. Reverse charge
mutations of each arginine lining the arginine channel disrupt spacer
acquisition in vivo (Fig. 2e). In addition, purified Cas1 R59D or R66D
proteins complexed with wild-type Cas2 are highly defective in
integrating 33-bp duplex or five-nucleotide overhang protospacer
substrates in vitro (Fig. 2f). Fluorescence polarization assays demonstrate
that the mutant complexes exhibit dramatically reduced affinity for



[Figure 2 panels: d, protospacer sequence map with arginine channel and arginine clamp contacts marked; e, spacer acquisition gels; f, integration of dsDNA and 5-nt overhang substrates; g, anisotropy (mA) against [Cas1] (nM) from 0.01 to 10,000 nM, with Kd > 5 µM noted.]


WT, wild type. f, Plot of per cent in vitro integration of either double-stranded
DNA (dsDNA; black) or 5-nucleotide (nt) overhang (blue) protospacers
with wild-type Cas1, Cas1(R59D) or Cas1(R66D) complexed with Cas2.
g, Fluorescence polarization binding assays of a 5-nucleotide overhang
protospacer with the same mutants as in f complexed with Cas2. The calculated
relative binding affinities (Kd) are indicated. Error bars represent the standard
deviation of three independent experiments. Data in panels e-g are results of
at least three biological replicates. Supplementary Information contains the
full images for e.


protospacer DNA, highlighting the critical role of this part of the Cas1- 
Cas2 complex for protospacer capture and complex stability (Fig. 2g). 

The Cas1-Cas2-DNA crystal structures uncover a protein wedge 
that terminates the protospacer double-stranded DNA region and 
allows single-stranded DNA overhangs to enter the arginine channel. A 
stacking interaction of the 5’ terminal base (adenine 6 in Fig. 3a, b) with 
Y22 of Cas1 stabilizes protospacer duplex unwinding, directing each 
single-stranded 3’ overhang to sharply bend ~90° away from the duplex 
and into the active site channel (Fig. 3b). A mutation of Y22 to alanine 
reduces spacer acquisition in vivo, whereas a phenylalanine mutation 
has near wild-type levels of acquisition, consistent with a specific role 
for Cas1 Y22 base-stacking in protospacer strand splaying (Fig. 3c). 


[Figure 3 graphic not reproduced. Recoverable labels: a, splayed protospacer with the nucleophilic and displaced strands and Cas1 Y22 marked at each fork; b, close-up of the DNA fork (nucleophilic strand, displaced strand); c, acquisition gel with Integration (%) quantification; d, integration (%) versus no. of splayed nucleotides for WT and Y22A.]


Figure 3 | Mechanism of protospacer DNA end separation. a, The 
5-nucleotide splayed protospacer sequence used for crystallization to 
determine the trajectory of the displaced non-nucleophilic strand. Cas1 Y22, 
involved in base stacking at the fork, is shown in blue. b, Close-up of the DNA 
fork showing the base-stacking interaction of Y22 with the terminal adenine 
nucleotide of the non-nucleophilic strand. The nucleotides are numbered 
from 5’ to 3’ of each DNA strand shown in a. The grey mesh shows the 
2Fo − Fc density, contoured at 2.2σ, of the first ejected nucleotide of the 
displaced strand. The arrows indicate the opposite trajectories of each strand. 
c, Agarose gel of in vivo acquisition assay of co-expressed wild-type (WT) 
Cas1 or the indicated Cas1 mutant with Cas2. Quantification is the mean 
of three independent experiments ± standard deviation. d, Plot of per cent 
integration of increasing number of splayed nucleotides at the protospacer 
ends using wild-type Cas1 (blue) or Cas1(Y22A) (blue) complexed with 
Cas2. Error bars represent the standard deviation of three independent 
experiments. Supplementary Information contains the full image for c. 


© 2015 Macmillan Publishers Limited. All rights reserved 


[Figure 4 graphic not reproduced. Recoverable labels: modelled target DNA; protospacer DNA; H208.]

Figure 4 | Model of protospacer DNA integration. a, View of crystal 
packing from a symmetry mate complex (grey) showing coordination of the 
symmetry DNA along a Cas1 active site. The inset is a magnified view of the 
coordination of the phosphodiester backbone with metal-binding residues 
E141, H208 and D221. The mesh represents an Fo − Fc density for a Mg²⁺ ion 
contoured at 2.2σ. b, c, Model of protospacer DNA integration into target 
DNA (black) and positioning of the scissile phosphate (green arrow) and the 
3’-OH nucleophile in the Cas1 active site. 


Sequence alignment of representative Cas1 proteins in type I CRISPR 
systems reveals that Y22 is not universally conserved in other bacteria, 
suggesting that additional or different Cas1 residues may stabilize the 
splayed ends in other CRISPR-Cas systems (Extended Data Fig. 5). 

The observed stacking interaction raises the possibility that fully 
duplexed protospacers are separated by Cas1 Y22, thereby displacing 
the 5’ end of the duplex, which we term the non-nucleophilic strand, 
from the nucleophilic strand carrying the 3’-OH. DNA transposases 
and retroviral integrases also utilize end fraying to isolate the reactive 
DNA strands for chemistry within enzyme active sites (refs 22–24). To test this 
potential activity of Cas1-Cas2, we introduced an increasing number 
of mismatches at the ends of the 33 bp protospacer to disrupt end base 
pairing and assayed their potential for in vitro integration (Fig. 3d and 
Extended Data Fig. 6a, b). Similar to the 3’ overhang substrates, the 
4- and 5-nucleotide frayed ends are highly preferred, presumably due 
to the lower energy required for capture of these substrates compared 
to perfectly duplexed ends (Fig. 3d). The complex containing the Cas1 
Y22A mutant regains marginal activity with substrates containing 
5- or 6-nucleotide splayed ends, suggesting that Y22 steers the 
non-nucleophilic DNA strand away from the active site (Fig. 3d). 
Notably, the displaced non-nucleophilic strand is not cleaved into a 
shorter fragment by Cas1-Cas2, as the protospacer ends are not processed 
during integration (Extended Data Fig. 6c). 

To determine the trajectory of the displaced non-nucleophilic strand 
after end-splaying, we crystallized Cas1-Cas2 with a protospacer with 
five-nucleotide frayed ends on both sides (Fig. 3a, b). The electron 
density at the fork is similar to the structures described above, except 
that we observe the first nucleotide of the displaced non-nucleophilic 
strand pointing in the opposite direction from the nucleophilic 
single-stranded DNA strand. Clear electron density is not observed 
for the remaining nucleotides of the displaced strand, indicating that 
they are not stabilized by the complex. 

An alternative crystal form grown in the presence of Mg²⁺ reveals 
secondary Cas1-DNA interactions that provide additional insight into 
the mechanism of Cas1-Cas2 genomic DNA target binding and subsequent 
integration. In addition to the two Cas1 ‘catalytic’ active sites 
carrying the 3’-OH ends of the protospacer, the ‘non-catalytic’ Cas1 
active sites interact with the protospacer DNA from a symmetry mate, 
revealing a possible coordination of the target DNA during integration 
(Fig. 4a and Extended Data Fig. 7a). The non-catalytic Cas1 engages 
the DNA minor groove by contacts with α-helix 7, causing a slight kink 
in the DNA compared to our alternative crystal form lacking Mg²⁺ 
(Extended Data Fig. 7b). A close-up of the active site shows continuous 
density for Mg²⁺ with E141, H208, D221 and a phosphate backbone of 
the presumed target DNA, capturing a snapshot of scissile phosphodiester 
bond coordination before integration (Fig. 4a). 

Because integration must occur in the active site that coordinates the 
3’-OH of the protospacer DNA, we modelled the protein-DNA interactions 
from the non-catalytic Cas1 active sites into the catalytic Cas1 
active sites. This reveals the positioning of the nucleophilic 3’-OH of the 
protospacer ends for attacking the scissile phosphodiester bond in the 
modelled DNA (Fig. 4b, c). Further work will be needed to shed light 
on how the complex specifically recognizes the leader-repeat region 
of the CRISPR locus for integration, as recently observed in vitro (refs 11–13). 

Together, these data explain key aspects of Cas1-Cas2 integrase-mediated 
acquisition of new DNA into bacterial genomes. First, we 
show that the substrates for integration are double-stranded DNA. 
Importantly, however, optimal substrates include a central 23 bp helical 
region flanked by five single-stranded nucleotides on each 3’ end. 
If substrates for CRISPR integration come from single-stranded DNA 
products of RecBCD, as recently suggested, they must somehow anneal 
or otherwise become double stranded before Cas1-Cas2 capture (ref. 20). It 
remains unclear how the Cas1-Cas2 complex recognizes the AAG 
protospacer adjacent motif during protospacer selection, since the 
terminal nucleotides containing the 3’-OH nucleophiles are coordinated 
similarly in the Cas1 active sites (Fig. 1). Second, the Cas1-Cas2 
integrase architecture specifies the precise length of integrated DNA, 
ensuring uniformity of spacer lengths within CRISPR loci. Finally, the 
structure-based model of DNA target sequence positioning suggests that 
in addition to catalysing the integration reaction, Cas1 plays a role in 
binding the target CRISPR locus. Target binding could possibly disrupt 
the structural symmetry observed in the crystal structure to coordinate 
the sequence-specific integration reactions at the leader-end of the 
CRISPR locus. Insights into target site recognition may offer strategies 
for altering or enhancing integration site specificity, with implications 
for use of the Cas1-Cas2 integrase as a genome-modifying technology. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 31 August; accepted 6 October 2015. 
Published online 21 October 2015. 


1. Barrangou, R. et al. CRISPR provides acquired resistance against viruses in 
prokaryotes. Science 315, 1709-1712 (2007). 

2. Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J. & Soria, E. Intervening 
sequences of regularly spaced prokaryotic repeats derive from foreign genetic 
elements. J. Mol. Evol. 60, 174-182 (2005). 

3. Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly 
interspaced short palindrome repeats (CRISPRs) have spacers of 
extrachromosomal origin. Microbiology 151, 2551-2561 (2005). 

4. Pourcel, C., Salvignol, G. & Vergnaud, G. CRISPR elements in Yersinia pestis 
acquire new repeats by preferential uptake of bacteriophage DNA, and provide 
additional tools for evolutionary studies. Microbiology 151, 653-663 (2005). 

5. Garneau, J. E. et al. The CRISPR/Cas bacterial immune system cleaves 
bacteriophage and plasmid DNA. Nature 468, 67-71 (2010). 

6. van der Oost, J., Westra, E. R., Jackson, R. N. & Wiedenheft, B. Unravelling the 
structural and mechanistic basis of CRISPR-Cas systems. Nature Rev. 
Microbiol. 12, 479-492 (2014). 

7. Yosef, I., Goren, M. G. & Qimron, U. Proteins and DNA elements essential for the 
CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 
5569-5576 (2012). 

8. Datsenko, K. A. et al. Molecular memory of prior infections activates the 
CRISPR/Cas adaptive bacterial immunity system. Nature Comm. 3, 945 
(2012). 

9. Swarts, D. C., Mosterd, C., van Passel, M. W. & Brouns, S. J. CRISPR interference 
directs strand specific spacer acquisition. PLoS ONE 7, e35888 (2012). 

10. Nuñez, J. K. et al. Cas1-Cas2 complex formation mediates spacer acquisition 
during CRISPR-Cas adaptive immunity. Nature Struct. Mol. Biol. 21, 528-534 
(2014). 




11. Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. & Pul, U. Detection and 
characterization of spacer integration intermediates in type I-E CRISPR-Cas 
system. Nucleic Acids Res. 42, 7884-7893 (2014). 

12. Nuñez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer 
acquisition during CRISPR-Cas adaptive immunity. Nature 519, 193-198 
(2015). 

13. Rollie, C., Schneider, S., Brinkmann, A. S., Bolt, E. L. & White, M. F. Intrinsic 
sequence specificity of the Cas1 integrase directs new spacer acquisition. eLife 
4, 10.7554/eLife.08716 (2015). 

14. Heler, R., Marraffini, L. A. & Bikard, D. Adapting to new threats: the generation 
of memory by CRISPR-Cas immune systems. Mol. Microbiol. 93, 1-9 (2014). 

15. Brouns, S. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. 
Science 321, 960-964 (2008). 

16. Carte, J., Wang, R., Li, H., Terns, R. M. & Terns, M. P. Cas6 is an 
endoribonuclease that generates guide RNAs for invader defense in 
prokaryotes. Genes Dev. 22, 3489-3496 (2008). 

17. Haurwitz, R. E., Jinek, M., Wiedenheft, B., Zhou, K. & Doudna, J. A. Sequence- 
and structure-specific RNA processing by a CRISPR endonuclease. Science 
329, 1355-1358 (2010). 

18. Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and 
host factor RNase III. Nature 471, 602-607 (2011). 

19. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in 
adaptive bacterial immunity. Science 337, 816-821 (2012). 

20. Levy, A. et al. CRISPR adaptation biases explain preference for acquisition of 
foreign DNA. Nature 520, 505-510 (2015). 

21. Wiedenheft, B. et al. Structural basis for DNase activity of a conserved protein 
implicated in CRISPR-mediated genome defense. Structure 17, 904-912 
(2009). 

22. Savilahti, H., Rice, P. A. & Mizuuchi, K. The phage Mu transpososome core: DNA 
requirements for assembly and function. EMBO J. 14, 4893-4903 (1995). 

23. Scottoline, B. P., Chow, S., Ellison, V. & Brown, P. O. Disruption of the terminal 
base pairs of retroviral DNA during integration. Genes Dev. 11, 371-382 
(1997). 




24. Katz, R. A., Merkel, G., Andrake, M. D., Roder, H. & Skalka, A. M. Retroviral 
integrases promote fraying of viral DNA ends. J. Biol. Chem. 286, 
25710-25718 (2011). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank G. Meigs and the 8.3.1 beamline staff at the 
Advanced Light Source for assistance with data collection, J. Chen for input on 
experimental design and members of the Doudna laboratory for comments 
and discussions. The 8.3.1 beamline is supported by UC Office of the President, 
Multicampus Research Programs and Initiatives grant MR-15-328599 and 
Program for Breakthrough Biomedical Research, which is partially funded 
by the Sandler Foundation. This project was funded by US National Science 
Foundation grant No. 1244557 to J.A.D. and by NIH grant AI070042 to A.N.E. 
J.K.N. and L.B.H. are supported by US National Science Foundation Graduate 
Research Fellowships and J.K.N. by a UC Berkeley Chancellor’s Graduate 
Fellowship. P.J.K. is supported as a Howard Hughes Medical Institute Fellow of 
the Life Sciences Research Foundation. J.A.D. is an Investigator of the Howard 
Hughes Medical Institute and a member of the Center for RNA Systems Biology. 


Author Contributions J.K.N. and L.B.H. conducted the crystallography, 
biochemistry and in vivo spacer acquisition assays. J.K.N., L.B.H. and P.J.K. 
collected the X-ray diffraction data and determined the crystal structures. J.K.N., 
L.B.H., P.J.K., A.N.E. and J.A.D. designed the study, analysed all data and wrote 
the manuscript. 


Author Information Atomic coordinates and structure factors for the reported 
crystal structures have been deposited at the Protein Data Bank under 
accession codes 5DS4 (no Mg²⁺), 5DS5 (with Mg²⁺) and 5DS6 (splayed DNA). 
Reprints and permissions information is available at www.nature.com/reprints. 
The authors declare no competing financial interests. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to J.A.D. (doudna@berkeley.edu). 




METHODS 

No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Cas1, Cas2 and DNA preparation. The Cas1 and Cas2 proteins from E. coli 
K12 (MG1655) were cloned and separately purified as previously described (ref. 10). 
Single-stranded DNA (ssDNA) oligonucleotides purchased from Integrated 
DNA Technologies were annealed in 20 mM HEPES-NaOH, pH 7.5, 25 mM KCl, 
10 mM MgCl₂ by heating at 95°C for 3 min and slow cooling to room temperature. 
The pCRISPR DNA target for in vitro integration was constructed as previously 
described (ref. 12). The DNA substrates used for crystallization were gel-purified before 
complex formation. The sequences for the five-nucleotide overhang substrates 
used for crystallization are: ssDNA1, 5’-ATTTACTACTCGTTCTGGTGTTTCTCGT-3’; 
and ssDNA2, 5’-AAACACCAGAACGAGTAGTAAATTGGGC-3’. The 
sequences for the five-nucleotide splayed substrates are: ssDNA1, 
5’-TAAACATTTACTACTCGTTCTGGTGTTTCTCGT-3’; and ssDNA2, 
5’-CATCTAAACACCAGAACGAGTAGTAAATTGGGC-3’. 
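The geometry implied by these oligonucleotides can be checked directly from their sequences. The short Python sketch below is an illustration only and is not part of the published analysis; the function names (revcomp, duplex_layout) are hypothetical. It finds the longest perfectly paired block between each ssDNA1/ssDNA2 pair and reports how many nucleotides remain unpaired at each end, assuming simple Watson-Crick pairing with no gaps.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def duplex_layout(top, bottom):
    """Locate the longest perfectly paired block between two 5'->3' strands.

    The block is the longest common substring of `top` and the reverse
    complement of `bottom`; everything outside it is counted as unpaired.
    """
    rc = revcomp(bottom)
    best_len = best_i = best_j = 0
    prev = [0] * (len(rc) + 1)
    for i in range(1, len(top) + 1):
        cur = [0] * (len(rc) + 1)
        for j in range(1, len(rc) + 1):
            if top[i - 1] == rc[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_i, best_j = cur[j], i - cur[j], j - cur[j]
        prev = cur
    return {
        "paired_bp": best_len,
        "top_5p_unpaired": best_i,
        "top_3p_unpaired": len(top) - best_i - best_len,
        # The start of the reverse complement maps to the 3' end of `bottom`.
        "bottom_3p_unpaired": best_j,
        "bottom_5p_unpaired": len(rc) - best_j - best_len,
    }


# Sequences as listed in the Methods (5'->3').
OVERHANG = ("ATTTACTACTCGTTCTGGTGTTTCTCGT", "AAACACCAGAACGAGTAGTAAATTGGGC")
SPLAYED = ("TAAACATTTACTACTCGTTCTGGTGTTTCTCGT", "CATCTAAACACCAGAACGAGTAGTAAATTGGGC")

overhang_layout = duplex_layout(*OVERHANG)
splayed_layout = duplex_layout(*SPLAYED)
print(overhang_layout)
print(splayed_layout)
```

Under these assumptions, both substrates share the same 23-bp paired core; the overhang pair leaves five unpaired nucleotides only at each 3’ end, whereas the splayed pair leaves five unpaired nucleotides on both strands at each end.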

In vivo acquisition and in vitro integration assays. The in vivo acquisition 
assays were performed as previously described (ref. 10). The in vitro integration 
reactions were conducted as previously described with slight modifications (ref. 12). After 
pre-incubation of equimolar Cas1 and Cas2 at 4°C, 100 nM of the resulting 
Cas1-Cas2 complex was incubated with 100 nM protospacer DNA for an additional 
10-15 min at room temperature. The integration reaction was activated by the 
addition of 300 ng (~5 nM) pCRISPR, incubated at 37°C for 1 h and quenched with 
DNA loading buffer supplemented with EDTA at a final concentration of 20 mM. 
The reaction products were analysed on 1.5% agarose gels. Per cent integration 
activity values were determined by quantifying the band intensity of the relaxed 
pCRISPR product and dividing by the intensity of all bands detected by Image 
Lab Software (Bio-Rad). We note that the integration activity could be a mixture 
of half-site and full-site integration products, as described previously (ref. 12). 
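The per cent integration calculation described above amounts to a simple ratio per lane, summarized across replicates. The Python sketch below illustrates that arithmetic; it is not the Image Lab workflow itself, the function names are hypothetical, and the band intensities are made-up numbers used only to show the calculation.

```python
from statistics import mean, stdev


def percent_integration(relaxed_band, other_bands):
    """Relaxed-product intensity as a percentage of all detected bands in a lane."""
    total = relaxed_band + sum(other_bands)
    if total <= 0:
        raise ValueError("no signal detected in lane")
    return 100.0 * relaxed_band / total


def summarize(replicates):
    """Mean and sample standard deviation over replicate lanes.

    `replicates` is a list of (relaxed_band, other_bands) tuples.
    """
    values = [percent_integration(relaxed, others) for relaxed, others in replicates]
    return mean(values), stdev(values)


# Hypothetical band intensities (arbitrary units) for three replicate lanes.
example_lanes = [(30.0, [70.0]), (40.0, [60.0]), (50.0, [50.0])]
example_mean, example_sd = summarize(example_lanes)
print(f"{example_mean:.1f} ± {example_sd:.1f} % integration")
```

Because the relaxed band may contain both half-site and full-site products, a value computed this way reports total integration activity rather than the yield of either product alone.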
Complex formation, crystallization and structure determination. Purified Cas1 
and Cas2 were incubated with protospacer DNA at equimolar concentrations 
(50 µM) in buffer A (500 mM KCl, 20 mM HEPES-NaOH, pH 7.5, 1 mM DTT, 
10 mM EDTA), followed by overnight dialysis at 4°C against buffer B (100 mM 
KCl, 20 mM HEPES-NaOH, pH 7.5, 1 mM DTT, 5 mM EDTA). The dialysed sample 
was applied on a Superdex 75 10/300 column (GE Healthcare) in buffer B. 
Peak fractions were pooled and concentrated to ~3 mg ml⁻¹ for crystallization. 
Optimized crystals were grown by hanging-drop vapour diffusion at room 
temperature in two different conditions, as described in the text. The Mg²⁺-containing 
crystals grew as gem-like morphologies in 50 mM MES, pH 6.1, 10% isopropanol 
and 20 mM MgCl₂. The Mg²⁺-free crystals grew as rods in 100 mM sodium citrate 
tribasic pH 5.6, 200 mM sodium acetate and 8% PEG 8000 (w/v). The crystals were 
briefly transferred into a drop containing either 25% ethylene glycol (Mg²⁺-containing 
crystals) or 30% glycerol (Mg²⁺-free crystals) for cryoprotection and frozen in 
liquid nitrogen. The Cas1-Cas2 complex with a splayed DNA substrate crystallized 
in the same conditions as the Mg²⁺-free crystals. 




X-ray diffraction data were collected under cryogenic conditions at beamline 
8.3.1 at the Lawrence Berkeley National Laboratory Advanced Light Source. 
Initial phases were obtained by sequential molecular replacement using individual 
protein components of the Cas1-Cas2 apo structure (Protein Data Bank (PDB) 
accession number 4P6I) as search models. Following initial placement of two Cas1 
dimers and a dimer of Cas2, phases were improved by performing one round of 
rigid body refinement in PHENIX (ref. 25). The resulting maps showed clear unbiased 
density for protospacer DNA, and subsequent model building was performed 
through iterative rounds of building in Coot (ref. 26) and refinement in PHENIX with 
NCS restraints on the protein subunits. The asymmetric unit of each of the three 
structures contains one copy of the Cas1-Cas2 complex bound to protospacer DNA. 
Statistics for the final crystal structures are reported in Extended Data Table 1. 
The final structures are missing clear density for the loop connecting α6 and α7 
of Cas1. We assume this loop to be highly disordered as it is also not observed 
in the apo E. coli Cas1 crystal structure (PDB 3NKD) and the apo Cas1-Cas2 
complex (PDB 4P6I) (refs 10, 27). 

Fluorescence polarization. Fluorescence polarization assays were performed in 
20 mM HEPES-NaOH, pH 7.5, 25 mM KCl, 5 mM EDTA, 1 µg ml⁻¹ BSA and 1 mM 
DTT. Cas1-Cas2 were complexed and purified over gel filtration for all binding 
assays. The 3’-fluorescein labelled DNA substrate was added to the protein 
solution at a final concentration of 5 nM and the DNA-protein mixture was allowed 
to incubate for 30 min at 22°C. Measurements were made by excitation at 485 nm 
and monitoring emission at 535 nm. Data were fit to a binding isotherm to obtain 
Kd. Each experiment was conducted in triplicate and error bars represent the 
standard deviation. 
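The Methods do not specify the form of the binding isotherm. As an illustration only, the Python sketch below assumes a single-site hyperbolic model, A = A_free + ΔA·[P]/(Kd + [P]), which is a common choice when the labelled DNA (here 5 nM) is well below Kd. The function names and the synthetic data are hypothetical, and the fit is a plain log-spaced scan over Kd with a linear least-squares solve for the two amplitudes, not the authors' fitting procedure.

```python
import math


def isotherm(conc, kd, a_free, delta):
    """Single-site binding model: anisotropy as a function of protein concentration."""
    return a_free + delta * conc / (kd + conc)


def _linear_fit(conc, signal, kd):
    """Best a_free and delta for a fixed kd (ordinary least squares), plus the SSE."""
    frac = [c / (kd + c) for c in conc]
    n = len(frac)
    mean_f, mean_s = sum(frac) / n, sum(signal) / n
    sff = sum((f - mean_f) ** 2 for f in frac)
    sfs = sum((f - mean_f) * (s - mean_s) for f, s in zip(frac, signal))
    delta = sfs / sff
    a_free = mean_s - delta * mean_f
    sse = sum((a_free + delta * f - s) ** 2 for f, s in zip(frac, signal))
    return a_free, delta, sse


def fit_kd(conc, signal, steps=600):
    """Scan Kd on a log grid spanning the titration range; return (kd, a_free, delta).

    Assumption: Kd lies within the tested concentration range. A Kd beyond the
    highest concentration can only be reported as a lower bound.
    """
    lo, hi = math.log10(min(conc)), math.log10(max(conc))
    best = None
    for k in range(steps + 1):
        kd = 10 ** (lo + (hi - lo) * k / steps)
        a_free, delta, sse = _linear_fit(conc, signal, kd)
        if best is None or sse < best[3]:
            best = (kd, a_free, delta, sse)
    return best[:3]


# Synthetic, noise-free titration (nM) generated from the model itself.
conc_nm = [1, 3, 10, 30, 100, 300, 1000, 3000]
signal_ma = [isotherm(c, kd=50.0, a_free=40.0, delta=120.0) for c in conc_nm]
kd_fit, a_free_fit, delta_fit = fit_kd(conc_nm, signal_ma)
print(f"Kd ≈ {kd_fit:.1f} nM")
```

A scan of this kind cannot determine a Kd that exceeds the highest protein concentration tested, which is one reason a weak binder is reported only as a bound such as Kd > 5 µM.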

Sequence alignment. The cas1 sequences were obtained from the National 
Center for Biotechnology Information (NCBI) Gene Data Bank. A representative 
cas1 from each CRISPR type I subtype was chosen based on previous subtype 
assignments and the alignment was generated using MAFFT (refs 28, 29). The 
organisms chosen for the alignment are: Escherichia coli K-12, Cronobacter 
dublinensis str. 582, Erwinia amylovora, Yersinia pestis biovar Antiqua str. B42003004, 
Yersinia kristensenii, Hafnia alvei, Sulfolobus solfataricus, Thermotoga maritima, 
Pseudothermotoga lettingae, Deferribacter desulfuricans, Desulfovibrio vulgaris, 
Bacillus halodurans, Bacillus cereus, Synechocystis sp. PCC 6803, Cyanothece sp. 
PCC 8802 and Limnoraphis robusta. 


25. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for 
macromolecular structure solution. Acta Crystallogr. D 66, 213-221 (2010). 

26. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. 
Acta Crystallogr. D 60, 2126-2132 (2004). 

27. Babu, M. et al. A dual function of the CRISPR-Cas system in bacterial antivirus 
immunity and DNA repair. Mol. Microbiol. 79, 484-502 (2011). 

28. Makarova, K. S. et al. Evolution and classification of the CRISPR-Cas systems. 
Nature Rev. Microbiol. 9, 467-477 (2011). 

29. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software 
version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 
772-780 (2013). 




[Extended Data Figure 1 graphic not reproduced. a, Plot titled "Overhang length dependence": integration (%) versus 3’ nt overhang length (0 to 9).]

b, Protospacer sequences (top strand 5’ to 3’, bottom strand 3’ to 5’):

Full dsDNA  5’ GCCCAATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAAGAGCA 5’
1 nt        5’  CCCAATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAAGAGC  5’
2 nt        5’   CCAATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAAGAG   5’
3 nt        5’    CAATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAAGA    5’
4 nt        5’     AATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAAG     5’
5 nt        5’      ATTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAAA      5’
6 nt        5’       TTTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACAA       5’
7 nt        5’        TTACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCACA        5’
8 nt        5’         TACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCAC         5’
9 nt        5’          ACTACTCGTTCTGGTGTTTCTCGT 3’
            3’ CGGGTTAAATGATGAGCAAGACCA          5’


Extended Data Figure 1 | Effect of overhang length on integration efficiency. a, A plot of the per cent integration of protospacers ± standard deviation with 
varying 3’ single-stranded DNA extensions. A representative gel is shown in Fig. 1a. b, Protospacer sequences used for the assays described in a and Fig. 1a, 
with the red nucleotides indicating the 3’ overhang regions. 




[Extended Data Figure 2 graphic not reproduced. a, d, Gel-filtration chromatograms plotted against volume (mL), with Peak 1 and Peak 2 marked in a; b, c, e, gels of the peak fractions with molecular weight markers (kDa) and the Cas1 and Cas2 bands indicated.]
Extended Data Figure 2 | Assembly of Cas1-Cas2 complex bound 
to protospacer DNA. a, Gel filtration chromatogram of pre-assembled 
Cas1-Cas2 complex with protospacer DNA containing five-nucleotide 3’ 
overhangs. The dotted lines indicate the peak fractions of the Cas1-Cas2 
complex without DNA, as shown in d. The dotted lines indicate the peak 
fractions of the Cas1-Cas2 complex bound to DNA (first peak) and excess, 
unbound DNA (second peak). b, c, The fractions from peak 1 (~12 ml) and 
peak 2 (~15 ml) were analysed by Coomassie-stained SDS-PAGE (b) and 
12% urea-PAGE (c) to confirm the presence of Cas1, Cas2 and protospacer 
DNA. d, Gel-filtration chromatogram of assembled Cas1-Cas2 without 
protospacer DNA. e, Coomassie-stained SDS-PAGE of the peak fractions 
from d. Supplementary Information contains the full images for b, c and e. 




[Extended Data Figure 3 graphic not reproduced. Recoverable labels: apo Cas1-Cas2; Cas1.]


Extended Data Figure 3 | Conformational dynamics upon protospacer DNA binding. a, An overlay of the DNA-bound Cas1-Cas2 structure with the 
apo Cas1-Cas2 (grey, PDB 4P6I). b, Vector lines depicting the conformational changes the Cas1-Cas2 complex undergoes upon protospacer DNA binding 
compared to the apo complex (PDB 4P6I). The Cas1 subunits rotate towards the direction of the arrows. 




[Extended Data Figure 4 graphic not reproduced. Panel titles: b, Structure with Mg²⁺; c, Structure without Mg²⁺; each shows Terminus 1 and Terminus 2 with the 3’ end marked.]


Extended Data Figure 4 | Omit maps of the protospacer DNA. a, Simulated annealing Fo − Fc omit electron density map of the entire protospacer DNA using 
the ‘no Mg²⁺’ map and model. b, c, Simulated annealing Fo − Fc omit electron density maps of the terminal five nucleotides in the active sites of the structures 
(b) with Mg²⁺ or (c) without Mg²⁺ in the crystallization condition. The maps are contoured at 2.0σ. 




[Extended Data Figure 5 graphic not reproduced: multiple sequence alignment of Cas1 proteins with the E. coli secondary structure (β-strands and α-helices) shown beneath each block. Marked E. coli residues: R41, R59, R66, R84, E141, R163, H208, D221, R245 and R248. Sequences are grouped by subtype: I-E (E. coli, C. dublinensis, E. amylovora); I-F (Y. pestis, Y. kristensenii, H. alvei); I-A (S. solfataricus); I-B (T. maritima, P. lettingae, D. desulfuricans); I-C (D. vulgaris, B. halodurans, B. cereus); I-D (Synechocystis sp., Cyanothece sp., L. robusta).]


Extended Data Figure 5 | Sequence alignment of Cas1 proteins in type I
CRISPR systems. Sequence alignments of Cas1 from representative
organisms with type I CRISPR systems. The E. coli sequence is displayed
at the top. The dots indicate the residues described in this study, with the
red dots indicating the metal-binding residues. The box highlights the
non-universal conservation of the E. coli Y22 residue in the β1 region of
type I CRISPR systems. The secondary structure representations shown are
for the E. coli Cas1.


© 2015 Macmillan Publishers Limited. All rights reserved 


[Figure panels: a, agarose gel with lanes for splayed end lengths of 0 to 6 nt, showing the integration product, S.C. and band X; b, protospacer sequences for the full dsDNA and for substrates with 1 to 6 nt splayed ends; c, denaturing gel of dsDNA, 5 nt splayed, ssDNA 1 and ssDNA 2 substrates.]

Full dsDNA protospacer (panel b):
5′ GCCCAATTTACTACTCGTTCTGGTGTTTCTCGT 3′
3′ CGGGTTAAATGATGAGCAAGACCACAAAGAGCA 5′

Extended Data Figure 6 | Integration of protospacer substrates with
splayed ends. a, Representative agarose gel of in vitro integration reactions
using increasing lengths of splayed ends. The average per cent integration
of three independent experiments is plotted in Fig. 3d. b, Sequences of
protospacers used in the integration assays in a. c, A 12% denaturing
polyacrylamide gel of protospacers after incubation with Cas1-Cas2 for 1 h
at 37 °C in integration assay buffer conditions. The indicated DNA substrates
are radiolabelled at the 5′ end. Supplementary Information contains the full
images for a and c. nt, nucleotide.




LETTER 


[Figure panels: a, crystal packing with symmetry mates; b, superposition of the structures with and without Mg²⁺.]

Extended Data Figure 7 | Crystallographic packing of the complex bound
to Mg²⁺. a, View of the symmetry mates (grey) contacting the non-catalytic
Cas1 subunits (green). Catalytic Cas1 subunits are shown in blue, Cas2 in
yellow and DNA is shown in salmon and red. b, Superposition of our two
crystal structures, with or without Mg²⁺, shows a slight DNA kink in the
structure bound to Mg²⁺ (dotted box). This region contacts α-helix 7 of a
symmetry mate, as described in the text.




Extended Data Table 1 | Summary of X-ray crystallography data collection and refinement

                               Without Mg                With Mg                   Splayed substrate
Data collection
Space group                    P2₁2₁2₁                   P2₁2₁2₁                   P2₁2₁2₁
Cell dimensions
  a, b, c (Å)                  88.02, 120.01, 196.01     75.66, 165.93, 167.26     88.02, 123.01, 196.01
  α, β, γ (°)                  90, 90, 90                90, 90, 90                90, 90, 90
Resolution (Å)                 49.00-3.20 (3.36-3.20)    46.41-2.95 (3.06-2.95)    48.9-3.35 (3.42-3.35)
Rmerge (%)                     30.8 (146)                19.6 (157)                28.5 (126)
Rpim (%)                       12.8 (61.4)               10.8 (86.3)               21.6 (94.3)
I/σI                           6.4 (1.5)                 9.8 (1.4)                 5.0 (1.3)
CC1/2                          98.5 (72.4)               99.3 (42.0)               98.3 (72.7)
Completeness (%)               99.8 (99.0)               100 (99.9)                99.6 (97.7)
Redundancy                     6.7 (6.6)                 7.9 (8.0)                 4.1 (4.0)
Wilson B factor (Å²)           63.8                      64.0                      73.7
Refinement
Resolution (Å)                 49.00-3.20                46.41-2.95                49.00-3.35
No. reflections                35,808 (3,502)            44,960 (4,418)            31,049 (2,885)
Rwork/Rfree                    24.2/27.0                 23.0/25.4                 23.2/27.4
No. atoms
  Protein                      9,375                     9,576                     9,375
  DNA                          1,142                     1,142                     1,165
  Metal                        0                         4                         0
Average B-factors (Å²)
  Protein                      65.9                      66.6                      86.6
  DNA                          76.2                      67.2                      103.0
  Metal                                                  51.6
R.m.s. deviations
  Bond lengths (Å)             0.003                     0.003                     0.004
  Bond angles (°)              0.72                      0.75                      0.81
Ramachandran statistics (%)
  Favored                      96.0                      95.0                      96.0
  Allowed                      3.75                      4.51                      3.58
  Outliers                     0.25                      0.49                      0.42

One crystal was used for each structure.
Values for the highest-resolution shell are shown in parentheses.




LETTER 


doi:10.1038/nature15519 


Endoperoxide formation by an α-ketoglutarate-
dependent mononuclear non-haem iron enzyme


Wupeng Yan*, Heng Song*, Fuhang Song, Yisong Guo, Cheng-Hsuan Wu, Ampon Sae Her, Yi Pu,
Shu Wang, Nathchar Naowarojna, Andrew Weitz, Michael P. Hendrich, Catherine E. Costello, Lixin Zhang,
Pinghua Liu & Yan Jessie Zhang


Many peroxy-containing secondary metabolites¹,² have been
isolated and shown to provide beneficial effects to human health³⁻⁵.
Yet, the mechanisms of most endoperoxide biosyntheses are not
well understood. Although endoperoxides have been suggested
as key reaction intermediates in several cases⁶⁻⁸, the only
well-characterized endoperoxide biosynthetic enzyme is prostaglandin
H synthase, a haem-containing enzyme⁹. Fumitremorgin B
endoperoxidase (FtmOx1) from Aspergillus fumigatus is the first
reported α-ketoglutarate-dependent mononuclear non-haem iron
enzyme that can catalyse an endoperoxide formation reaction¹⁰⁻¹².
To elucidate the mechanistic details for this unique chemical
transformation, we report the X-ray crystal structures of FtmOx1
and the binary complexes it forms with either the co-substrate
(α-ketoglutarate) or the substrate (fumitremorgin B). Uniquely,
after α-ketoglutarate has bound to the mononuclear iron centre in
a bidentate fashion, the remaining open site for oxygen binding and
activation is shielded from the substrate or the solvent by a tyrosine
residue (Y224). Upon replacing Y224 with alanine or phenylalanine,
the FtmOx1 catalysis diverts from endoperoxide formation
to the more commonly observed hydroxylation. Subsequent
characterizations by a combination of stopped-flow optical
absorption spectroscopy and freeze-quench electron paramagnetic
resonance spectroscopy support the presence of transient radical
species in FtmOx1 catalysis. Our results help to unravel the novel
mechanism for this endoperoxide formation reaction.

The verruculogen biosynthetic gene cluster was identified through
bioinformatic analysis¹⁰, and its chemical scaffold is assembled by a
non-ribosomal peptide synthetase, followed by several tailoring
reactions. Among them, the FtmOx1-catalysed endoperoxide formation
reaction is the most notable (Fig. 1a). Recent biochemical
characterizations indicate that, unlike prostaglandin H synthase, FtmOx1 is
an α-ketoglutarate (α-KG)-dependent mononuclear non-haem iron
enzyme¹¹,¹². Further characterization indicates that molecular oxygen
(O₂) is incorporated into verruculogen without O-O bond scission¹¹,
distinguishing FtmOx1 from all currently known α-KG-dependent
mononuclear non-haem iron enzymes¹³⁻¹⁷.

To unravel the mechanistic details of FtmOx1 catalysis, we first
characterized the FtmOx1-α-KG complex using anaerobically purified and
Fe²⁺-reconstituted FtmOx1 (FtmOx1-Fe(II)). Upon mixing the
reconstituted enzyme with α-KG under anaerobic conditions, a pink species
appeared (pink trace, extinction coefficient ε₅₂₉ of ~166 M⁻¹ cm⁻¹,
Fig. 1b). The dissociation constant (Kd) of this species was
~185 ± 35 µM (Extended Data Fig. 1a), close to the Kd values of
Fe(II)-α-KG complexes of other mononuclear non-haem iron enzymes
(for example, that of TauD)¹⁸. Upon exposure to O₂ and in the absence
of the substrate fumitremorgin B (1), the pink species faded and a
blue chromophore with a λmax at ~600 nm developed within 30 min
(blue trace, Fig. 1b). Tandem mass spectrometry (MS/MS) analysis of
the blue species indicated the oxidation of Y224 to
dihydroxyphenylalanine (DOPA, Fig. 1c), which is the result of a self-hydroxylation
reaction as observed in other mononuclear non-haem iron enzymes¹⁹.
Notably, the presence of the substrate fumitremorgin B (1) prevented
the FtmOx1 self-hydroxylation reaction (Extended Data Fig. 2). All
of these properties are consistent with the formation of the
FtmOx1-Fe(II)-α-KG complex.
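The binding and absorbance figures quoted above can be related with a short calculation. The following is only a minimal sketch, assuming simple 1:1 binding between enzyme and α-KG and Beer-Lambert behaviour with a 1 cm path length; the function names and the example concentrations (240 µM enzyme, 480 µM α-KG, borrowed from the Fig. 1f legend) are illustrative choices, not the authors' fitting procedure.

```python
import math


def complex_conc_uM(enzyme_uM: float, ligand_uM: float, kd_uM: float = 185.0) -> float:
    """Concentration of a 1:1 complex from total concentrations.

    Solves Kd = (E - c)(L - c) / c for c (the physically meaningful root).
    The default Kd is the ~185 uM value quoted in the text.
    """
    total = enzyme_uM + ligand_uM + kd_uM
    return (total - math.sqrt(total * total - 4.0 * enzyme_uM * ligand_uM)) / 2.0


def absorbance(complex_uM: float, epsilon_per_M_cm: float = 166.0, path_cm: float = 1.0) -> float:
    """Beer-Lambert absorbance, A = epsilon * c * l, with c converted from uM to M."""
    return epsilon_per_M_cm * complex_uM * 1e-6 * path_cm


if __name__ == "__main__":
    c = complex_conc_uM(240.0, 480.0)  # illustrative totals, in uM
    print(f"complex: {c:.1f} uM, absorbance: {absorbance(c):.4f}")
```

Under these assumptions only part of the enzyme is in the pink complex at sub-millimolar α-KG, and the expected absorbance is small, which is in line with the weak extinction coefficient reported.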

In previous studies, ascorbate was included as an additional
reductant¹¹,¹², although its role in FtmOx1 catalysis was not known¹¹,¹². We
observed that FtmOx1 is capable of catalysing fumitremorgin B (1)
oxidation in the absence of ascorbate (Fig. 1d and Extended Data
Fig. 3). At a fixed FtmOx1-Fe(II):fumitremorgin-B ratio of 1:1.5 and with
an excess of O₂, the amount of product increased with the amount
of α-KG until the α-KG:FtmOx1-Fe(II) ratio reached 1.0 (Fig. 1e and
Extended Data Fig. 4). In contrast, when the O₂:FtmOx1-Fe(II) ratio
was below 1.0, only a small amount (~0.2 equivalent) of product was
formed. Above 1.0, the amount of product increased with the increasing
amount of O₂, and plateaued when the O₂:FtmOx1-Fe(II) ratio was >2.0
(Fig. 1f and Extended Data Fig. 4). These results strongly suggest that
each FtmOx1-catalysed turnover consumes one equivalent of α-KG
and two equivalents of O₂. Unexpectedly, under our assay conditions,
compound 3 rather than verruculogen (2) was the dominant product
(Fig. 1e, f and Supplementary Information).
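The proposed stoichiometry (one α-KG and two O₂ per turnover) implies a simple limiting-reagent bound on product. The sketch below is an idealized upper bound only; the function names are hypothetical, and it does not attempt to reproduce the threshold behaviour seen below an O₂:enzyme ratio of 1.0.

```python
def equivalents(reagent_uM: float, enzyme_uM: float) -> float:
    """Express a reagent concentration as equivalents relative to the enzyme."""
    if enzyme_uM <= 0:
        raise ValueError("enzyme concentration must be positive")
    return reagent_uM / enzyme_uM


def max_product_equiv(akg_equiv: float, o2_equiv: float, substrate_equiv: float) -> float:
    """Upper bound on product (in enzyme equivalents), assuming each turnover
    consumes 1 alpha-KG, 2 O2 and 1 substrate, as the text proposes."""
    return min(akg_equiv / 1.0, o2_equiv / 2.0, substrate_equiv)


if __name__ == "__main__":
    # Concentrations from the Fig. 1f legend (uM): enzyme 240, alpha-KG 480, substrate 360.
    akg = equivalents(480.0, 240.0)
    sub = equivalents(360.0, 240.0)
    for o2 in (1.0, 1.5, 2.0):
        print(o2, max_product_equiv(akg, o2, sub))
```

With these inputs O₂ is the limiting reagent throughout, so the bound scales with the amount of oxygen-saturated buffer added.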

We determined the FtmOx1 crystal structure at 1.95 Å resolution
with the phase derived from selenomethionine-labelled FtmOx1 using
the single-wavelength anomalous dispersion method. FtmOx1 folds as
a ‘jelly roll’, a prevalent fold in mononuclear non-haem iron enzymes
(Fig. 2a)¹³. Two molecules in each asymmetric unit form a functional
dimer, consistent with our size-exclusion chromatography profile
and previous literature reports¹¹,¹². The dimer interface (2,461.6 Å²)
accounts for 17.1% of the FtmOx1 surface. The active site pocket at the
dimer interface has a volume of 222.6 Å³, as calculated by the DogSite
Server²⁰. This spacious pocket is partitioned into two parts: a
hydrophilic region where the non-haem iron centre is located and a
hydrophobic pocket formed by L64, F115, and F233 from one monomer with
I267 and V268 from the other (Fig. 2b).

H129, H205, D131, and three well-ordered water molecules form
an approximate octahedral coordination to the mononuclear iron
(Fig. 2c). One of the water ligands is hydrogen-bonded to Y224,
whose proximity to the mononuclear iron centre (~4.4 Å) explains
the formation of DOPA in the FtmOx1 self-hydroxylation reaction
(Fig. 1c). Co-crystallization or soaking of the co-substrate α-KG led
to an identical FtmOx1-Fe(II)-α-KG complex, in which α-KG binds to
the iron centre in a bidentate fashion by replacing two water molecules


1 Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, USA.
2 Department of Chemistry, Boston University, Boston, Massachusetts 02215, USA.
3 CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China.
4 Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, USA.
5 Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712, USA.
6 Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, Pennsylvania 15213, USA.
*These authors contributed equally to this work.


26 NOVEMBER 2015 | VOL 527 | NATURE | 539 




[Figure 1 panels: a, reaction scheme; b, absorption spectra of the FtmOx1 wild-type pink and blue species (wavelength, nm); c, MS/MS spectrum (m/z) showing oxidation of tyrosine; d, product structures; e, f, HPLC chromatograms (intensity at 300 nm versus time, min) for different α-KG:FtmOx1:substrate ratios and for 1, 1.5 and 2 equiv. oxygen.]

Figure 1 | Enzymatic characterization of wild-type FtmOx1. a, Proposed
FtmOx1 reaction. b, Formation of the FtmOx1-Fe(II)-α-KG binary complex
under anaerobic conditions (pink trace). Self-hydroxylation reaction
upon exposure of the binary complex to O₂ (blue trace). c, Electrospray
ionization MS/MS analysis of the blue species in b is consistent with the
oxidation of Y224 to DOPA224. d, Products formed in the FtmOx1 reaction.
e, α-KG stoichiometry analysis in FtmOx1 catalysis. High-performance
liquid chromatography (HPLC) chromatograms of FtmOx1 reactions
with various amounts of α-KG. Identities of the peaks were assigned
based on nuclear magnetic resonance (NMR) and high-resolution mass
spectrometry (see Supplementary Information). f, O₂ stoichiometry analysis
in FtmOx1 catalysis. HPLC chromatograms of FtmOx1 reactions contained
fumitremorgin B (360 µM), FtmOx1 (240 µM), and α-KG (480 µM) when
variable amounts of oxygen-saturated buffer were added to initiate the
reaction.


(Fig. 2d). In the FtmOx1-Fe(II)-α-KG complex, the 2-keto group of
α-KG coordinates to the iron centre trans to D131. Its 1-carboxylate
group binds trans to H205, which is the distal histidine of the
2-His-1-carboxylate facial triad (Fig. 2d and Extended Data Fig. 5a)¹³. In this
FtmOx1-Fe(II)-α-KG complex, the remaining water ligand (a potential
site for O₂ binding and activation) is completely shielded from
solvent or substrate by Y224 (Fig. 2d and Supplementary Video 1). In
contrast, in most reported structures of enzyme-α-KG complexes, the
1-carboxylate of α-KG coordinates trans to the proximal histidine of the
facial triad motif¹³, and the remaining open site for O₂ binding and
activation directly points towards the substrate. As a result, the oxoferryl
(Fe(IV)=O) species produced from oxygen activation is accessible to the
substrate for oxidative transformations (for example, TauD in Fig. 2e
and Supplementary Video 2)²¹.

To examine whether Y224 changes location upon substrate
binding, we also solved the structure of the FtmOx1-Fe(II)-fumitremorgin-B
complex at a resolution of 2.2 Å (Fig. 2f). The location
of the positive density is consistent for all data sets collected
for this complex by either co-crystallization or soaking (>15 data
sets). Substrate is modelled into the density at the active site with
an average occupancy of ~60% owing to the high hydrophobicity
of fumitremorgin B (Extended Data Fig. 5b). In this complex, Y224
adopts a conformation identical to that observed in the FtmOx1
alone (Fig. 2c) and FtmOx1-Fe(II)-α-KG complexes (Fig. 2d). Rings
A and B of the substrate form π-π stacking with the Y224 side chain
at a distance of ~3.3 Å (Fig. 2f). Superimposition of the
structures of the FtmOx1-Fe(II)-α-KG complex onto the
FtmOx1-Fe(II)-fumitremorgin-B complex revealed that the side chain of Y224
effectively separates the potential O₂ binding site from the substrate binding
pocket (Fig. 2g, Extended Data Fig. 5c and Supplementary Video 1).
This is in notable contrast to TauD, in which the oxygen binding and
activation site directly faces the substrate (Extended Data Fig. 5d and
Supplementary Video 2)²¹.

On the basis of the strategic positioning of Y224 in the FtmOx1 active
site, we next examined the role it plays in FtmOx1 catalysis. We
characterized two Y224 variants, Y224A- and Y224F-substituted FtmOx1.
The enzyme-α-KG complexes of both variants exhibit Kd values close
to that of the wild-type FtmOx1 (Extended Data Fig. 1b, c). However,
the product profiles of both variants were very different (Fig. 3). The
Y224A-substituted variant produced a mixture of at least five
detectable products with mainly dealkylation products (compounds 4, 5 and
6). Endoperoxides (2 and 3) only account for ~15% of the product
population (Fig. 3a, b).

The Y224F-substituted variant also produced endoperoxides (2
and 3) and dealkylation products (4 and 5). In addition, there were
more endoperoxides (2 and 3) formed by the FtmOx1(Y224F) variant
relative to the FtmOx1(Y224A) variant (~35% versus ~15% of the
product mixture, Fig. 3a). For the FtmOx1(Y224F)-Fe(II)-α-KG
complex in the absence of the substrate fumitremorgin B (1), exposure to
O₂ caused the complex to slowly change colour to blue, which implies
DOPA formation (Extended Data Fig. 6a). DOPA formation can be
explained by two sequential hydroxylation steps (F224→Y224 and
Y224→DOPA224). Indeed, this conclusion was supported by MS/MS
analysis of this variant (Extended Data Fig. 6b-e). Thus, the higher
level of endoperoxides (2 and 3) produced by the FtmOx1(Y224F)
variant is probably attributed to the conversion of the variant to wild-type
FtmOx1, which provides further evidence supporting the key role of
Y224 in FtmOx1 catalysis.

Mononuclear non-haem iron enzymes catalyse a wide range of
reactions¹³⁻¹⁷. Recently, unique transformations have been reported
which demonstrate the functional versatility of this class of enzymes,
including oxidative dehydrogenation in epoxide formation²²,
chlorination²³,²⁴, epimerization²⁵,²⁶, and C-C bond cleavage²⁷. FtmOx1
provides a further example of this diversity¹¹,¹². On the basis of our
structural and biochemical information, we propose a preliminary
FtmOx1 mechanistic model (Fig. 4). After α-KG and substrate binding,




[Figure 2 panels: a, FtmOx1 dimer; b, active site; c, iron centre; d, FtmOx1 α-KG complex with labelled residues including His129, Tyr224 and the 1-carboxylate of α-ketoglutarate; e, TauD α-KG complex with labelled residues including His99, Asp101 and His255; f, substrate complex; g, superimposition.]

Figure 2 | Structures of FtmOx1. a, Overall architecture of FtmOx1 shown
as a functional dimer with one monomer colour-coded based on secondary
structures (shown as stereo images). The iron centre is labelled as a grey
sphere. b, FtmOx1 active site shown in the electrostatic mode. c, FtmOx1
metallo-centre electron density (2mFo − DFc map) at 1σ contour. The
coordination of iron is represented by dashed lines. d, FtmOx1-Fe(II)-α-KG
binary complex. The α-KG molecule was modelled into a composite omit map
(mFo − DFc map) contoured to 2.8σ. The coordination of iron is represented
by dashed lines with distances labelled (units, Å). e, α-KG binding mode of
TauD (PDB accession code 1OS7). TauD is shown in an identical orientation
relative to that of FtmOx1 in d to highlight their differences in active site
topologies. f, Structure of the FtmOx1-Fe(II)-fumitremorgin-B complex.
g, Superimposition of the binary structures of FtmOx1-Fe(II)-α-KG and
FtmOx1-fumitremorgin-B. Y224 is highlighted in pink.


the first molecule of O₂ is activated to produce an Fe(IV)=O species
(species B) by a mechanism similar to that of other members of this class
of enzymes²⁸. Uniquely, in FtmOx1, because the O₂ activation site is
shielded from the substrate by Y224, direct oxidation of the substrate
by the Fe(IV)=O species is less likely. Instead, the Fe(IV)=O species oxidizes
Y224 to a tyrosyl radical (species C), which then removes a hydrogen
atom from the fumitremorgin B C21 position to form a substrate-based
radical (species D). A second molecule of O₂ reacts with species D to
form a peroxyl radical (species E). Species E then reacts with the other prenyl
arm to produce the endoperoxide along with the formation of a carbon


[Figure 3 panels: a, HPLC traces for Y224A- and Y224F-substituted FtmOx1 (time, min); b, products formed from fumitremorgin B (1) by the FtmOx1 Y224 variants.]

Figure 3 | Characterization of Y224A- and Y224F-substituted FtmOx1.
a, HPLC profiles of reactions from the two FtmOx1 variants, Y224A and
Y224F. Both traces were conducted in reaction mixture containing FtmOx1
Y224 variants (240 µM), fumitremorgin B (240 µM), α-KG (720 µM) and
O₂ (480 µM). Trace I, HPLC chromatograms of the reaction using
Y224A-substituted FtmOx1; trace II, HPLC chromatograms of the reaction using
Y224F-substituted FtmOx1. Note that a new column was used for the mutant
analyses relative to the one in wild-type FtmOx1 characterizations, which
led to the differences in retention times relative to the other HPLC traces.
b, Products formed in reactions using either Y224F- or Y224A-substituted
FtmOx1 variants. The compounds were characterized using NMR and MS
(see Supplementary Information).


centre radical at the C26 position (species F). Species F can re-oxidize
Y224 to a tyrosyl radical (species G). Starting from species G, two
pathways are possible. FtmOx1 can follow a mechanism similar to
prostaglandin H synthase (ref. 9) in which once the tyrosyl radical is formed,
multiple cycles of endoperoxide formation can be mediated through
this radical (pathway I, Fig. 4). However, the production of compound 3
in the FtmOx1 reaction points to another possibility (pathway II,
Fig. 4), in which the two electrons provided by the 2→3 oxidation
process reduce both Fe³⁺ and the tyrosyl radical to the resting state of
FtmOx1 (species A).

The formation of a small amount of endoperoxides in Y224A- and
Y224F-substituted FtmOx1 may be due to two competing pathways:
the hydroxyl-rebound and the endoperoxide formation pathways
(Extended Data Fig. 7). After Fe(IV)=O is formed, it may directly remove
a hydrogen atom from the fumitremorgin B (1) C21 position to form a
substrate-based radical (species C′, Extended Data Fig. 7). Subsequent
rebound by the hydroxyl radical will lead to the formation of
hydroxylation products (pathway I′, Extended Data Fig. 7). Decomposition
of the hydroxylation reaction product forms compounds 4 and 5. At
the same time, the substrate-based radical (species C′) may be trapped
by a second molecule of O₂, which leads to endoperoxide formation
(pathway II′, Extended Data Fig. 7).

To gain evidence supporting the presence of radical species in FtmOx1
catalysis as outlined in our FtmOx1 mechanistic model (Fig. 4), we
conducted spin-trapping experiments using 5,5-dimethyl-1-pyrroline
N-oxide (DMPO) as the reagent. In the presence of 50 equivalents of
DMPO, further oxidation of verruculogen 2 to 3 was markedly
suppressed, and 2 was the dominant product (Fig. 5a). This result provides
evidence supporting the involvement of radicals in FtmOx1 catalysis.
Next, the FtmOx1 reaction was monitored with stopped-flow optical
absorption spectroscopy (Fig. 5b). The UV-visible spectrum of the
solution generated after rapid mixing of O₂-saturated buffer with
the FtmOx1-Fe(II)-fumitremorgin-B-α-KG complex demonstrated the
accumulation of a transient species centred at ~420 nm. The amount




[Figure 4: reaction scheme showing species A to G, with pathways I and II, starting from compound 1.]

Figure 4 | Proposed FtmOx1 mechanistic model. The oxygen-oxygen bonds shown in blue highlight the incorporation of endoperoxide into the substrate
fumitremorgin B.


of this species maximized at ~0.2 s and then decayed within ~3 s. This
wavelength differs from that of the tyrosyl radicals observed in
ribonucleotide reductase²⁹ and another reported α-KG-dependent iron enzyme,
CarC²⁶ (a peak at 410 nm with a shoulder at 390 nm). Chemical quench
experiments performed under the same conditions indicated that the
consumption of substrate 1 and the formation of products (2 and 3)
occurred on the timescale of seconds per cycle (Extended Data Fig. 8a),
suggesting that the 420 nm species observed in the stopped-flow optical
absorption spectroscopy experiments is a kinetically competent
intermediate. FtmOx1 catalysis was then investigated by rapid freeze-quench


[Figure 5 panels: a, HPLC chromatograms (absorption versus time, min); b, stopped-flow absorbance spectra (absorbance versus wavelength, nm) with an inset of the absorbance at 420 nm versus time; c, EPR spectra (signal versus magnetic field, mT).]

Figure 5 | Evidence for transient radical species in the reaction pathway.
a, HPLC chromatograms of FtmOx1 reaction under three different
conditions. The reaction mixture contained FtmOx1 (240 µM),
fumitremorgin B (200 µM), α-KG (300 µM), and was initiated with
O₂-saturated buffer. Trace I, FtmOx1 reaction; trace II, FtmOx1 reaction in the
presence of 10 mM DMPO; trace III, FtmOx1 substrate alone. b, Absorbance
changes upon mixing the O₂-saturated buffer with the reaction mixture
in 100 mM Tris-HCl (pH 7.5) buffer containing FtmOx1 (0.65 mM), Fe(II)
(0.58 mM), fumitremorgin B (0.58 mM) and α-KG (12 mM). The decay of the
Fe-α-KG complex charge transfer band centred at ~520 nm (dashed arrow)
and the formation and decay of the spectral feature centred at ~420 nm
(arrow) are highlighted. Inset: time-dependent absorbance change at 420 nm.
The absorbance reported in b was obtained by blanking the spectrometer
with the anaerobic buffer containing 100 mM Tris-HCl (pH 7.5). The
absorbance reported in the inset was obtained by subtracting the absorbance
at 420 nm of the 2 ms spectrum from all other spectra recorded. The trace
is the average of two trials. c, Spectroscopic evidence for transient radical
species. X-band EPR spectra measured at 19 K of reaction samples
freeze-quenched at the indicated time points. Measurement conditions: microwave
frequency, 9.64 GHz; microwave power, 2 µW; modulation amplitude, 1 mT;
and modulation frequency, 100 kHz.

in conjunction with electron paramagnetic resonance (EPR)
spectroscopy. Two EPR signals were observed at 0.01 s (earliest possible time
on instrument) and were highest at ~0.2 s after the rapid mixing of
O₂-saturated buffer with the FtmOx1-Fe(II)-fumitremorgin-B-α-KG
complex (Extended Data Fig. 9a). The first EPR signal with resonances
at g = 4.54, 4.26, and 3.93 (Extended Data Fig. 9b) belongs to a
high-spin Fe³⁺ species having axial and rhombic zero-field splitting
parameters of |D| < 0.5 cm⁻¹ and E/D ≈ 0.26, respectively. These parameters
are not typical of adventitious Fe³⁺. The second EPR signal was in
the g = 2 region and most likely belongs to a radical species (Fig. 5c




and Extended Data Fig. 9c). The formation and decay of this radical
signal closely followed the kinetics of the 420 nm absorption feature
observed in stopped-flow optical absorption spectroscopy
experiments (Extended Data Fig. 8b), indicating that they are from the same
intermediate species. Spin quantification of the EPR signals at ~0.2 s
revealed that the Fe³⁺ and radical species accumulated to ~0.35 and
~0.25 equivalents, respectively. The width of the radical EPR signal
(~12 mT edge-to-edge width, Fig. 5c) was significantly broader than
that of magnetically isolated organic or protein radical signals²⁹. Such
broadening could be due to a magnetic dipolar interaction of the
radical species with an adjacent spin centre, most likely the Fe³⁺ centre
depicted in species D, E or F in Fig. 4.
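To place the reported g values on the magnetic-field axis, the EPR resonance condition hν = gμ_B·B can be evaluated at the 9.64 GHz microwave frequency given in the Fig. 5 legend. This is a minimal sketch for orientation only; the function names are illustrative, the constants are the CODATA values, and no lineshape or zero-field splitting is modelled.

```python
PLANCK_J_S = 6.62607015e-34          # Planck constant, J s
BOHR_MAGNETON_J_T = 9.2740100783e-24  # Bohr magneton, J/T


def resonance_field_mT(g: float, freq_GHz: float = 9.64) -> float:
    """Resonant field B = h*nu / (g*mu_B), returned in millitesla."""
    if g <= 0:
        raise ValueError("g must be positive")
    return PLANCK_J_S * freq_GHz * 1e9 / (g * BOHR_MAGNETON_J_T) * 1e3


def g_from_field(field_mT: float, freq_GHz: float = 9.64) -> float:
    """Inverse relation: effective g value for a resonance at a given field."""
    return PLANCK_J_S * freq_GHz * 1e9 / (BOHR_MAGNETON_J_T * field_mT * 1e-3)


if __name__ == "__main__":
    for g in (4.54, 4.26, 3.93, 2.0):
        print(f"g = {g}: {resonance_field_mT(g):.1f} mT")
```

At this frequency a g = 2 signal falls near 344 mT, whereas the g ≈ 4.3 resonances of the high-spin Fe³⁺ species appear at roughly half that field.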

In summary, our FtmOx1 structural and biochemical characteri- 
zation provides a notable example of the catalytic versatility of mono- 
nuclear non-haem iron enzymes and how changes in the secondary 
coordination sphere to the non-haem iron facilitate unprecedented 
chemical transformations. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 15 May; accepted 24 August 2015. 
Published online 2 November 2015. 


1. Casteel, D. A. Peroxy natural products. Nat. Prod. Rep. 9, 289-312 (1992).
2. Casteel, D. A. Peroxy natural products. Nat. Prod. Rep. 16, 55-73 (1999).
3. Chaturvedi, D., Goswami, A., Saikia, P. P., Barua, N. C. & Rao, P. G. Artemisinin
and its derivatives: a novel class of anti-malarial and anti-cancer agents. Chem.
Soc. Rev. 39, 435-454 (2010).
4. Paddon, C. J. & Keasling, J. D. Semi-synthetic artemisinin: a model for the use
of synthetic biology in pharmaceutical development. Nature Rev. Microbiol. 12,
355-367 (2014).
5. Dembitsky, V. M. Bioactive peroxides as potential therapeutic agents. Eur. J.
Med. Chem. 43, 223-251 (2008).
6. Widboom, P. F., Fielding, E. N., Liu, Y. & Bruner, S. D. Structural basis for
cofactor-independent dioxygenation in vancomycin biosynthesis. Nature 447,
342-345 (2007).
7. Steiner, R. A., Janssen, H. J., Roversi, P., Oakley, A. J. & Fetzner, S. Structural
basis for cofactor-independent dioxygenation of N-heteroaromatic compounds
at the α/β-hydrolase fold. Proc. Natl Acad. Sci. USA 107, 657-662 (2010).
8. Thierbach, S. et al. Substrate-assisted O₂ activation in a cofactor-independent
dioxygenase. Chem. Biol. 21, 217-225 (2014).
9. Marnett, L. J. Cyclooxygenase mechanisms. Curr. Opin. Chem. Biol. 4, 545-552
(2000).
10. Grundmann, A. & Li, S. M. Overproduction, purification and characterization of
FtmPT1, a brevianamide F prenyltransferase from Aspergillus fumigatus.
Microbiology 151, 2199-2207 (2005).
11. Steffan, N., Grundmann, A., Afiyatullov, S., Ruan, H. & Li, S. M. FtmOx1, a
non-heme Fe(II) and α-ketoglutarate-dependent dioxygenase, catalyses the
endoperoxide formation of verruculogen in Aspergillus fumigatus. Org. Biomol.
Chem. 7, 4082-4087 (2009).
12. Kato, N. et al. Gene disruption and biochemical characterization of
verruculogen synthase of Aspergillus fumigatus. ChemBioChem 12, 711-714
(2011).
13. Clifton, I. J. et al. Structural studies on 2-oxoglutarate oxygenases and related
double-stranded β-helix fold proteins. J. Inorg. Biochem. 100, 644-669 (2006).
14. Hausinger, R. P. Fe(II)/α-ketoglutarate-dependent hydroxylases and related
enzymes. Crit. Rev. Biochem. Mol. Biol. 39, 21-68 (2004).
15. Costas, M., Mehn, M. P., Jensen, M. P. & Que, L. Dioxygen activation at
mononuclear nonheme iron active sites: Enzymes, models, and intermediates.
Chem. Rev. 104, 939-986 (2004).
16. Solomon, E. I. et al. Geometric and electronic structure/function correlations in
non-heme iron enzymes. Chem. Rev. 100, 235-350 (2000).
17. Kovaleva, E. G. & Lipscomb, J. D. Versatility of biological non-heme Fe(II)
centers in oxygen activation reactions. Nature Chem. Biol. 4, 186-193
(2008).
18. Ryle, M. J., Padmakumar, R. & Hausinger, R. P. Stopped-flow kinetic analysis of
Escherichia coli taurine/α-ketoglutarate dioxygenase: interactions with
α-ketoglutarate, taurine, and oxygen. Biochemistry 38, 15278-15286 (1999).
19. Liu, A., Ho, R. Y. N. & Que, L. Alternative reactivity of an α-ketoglutarate-
dependent Iron(II) oxygenase: enzyme self-hydroxylation. J. Am. Chem. Soc.
123, 5126-5127 (2001).
20. Volkamer, A., Kuhn, D., Grombacher, T., Rippmann, F. & Rarey, M. Combining
global and local measures for structure-based druggability predictions.
J. Chem. Inf. Model. 52, 360-372 (2012).
21. Elkins, J. M. et al. X-ray crystal structure of Escherichia coli
taurine/α-ketoglutarate dioxygenase complexed to ferrous iron and substrates.
Biochemistry 41, 5185-5192 (2002).
22. Liu, P. et al. Protein purification and function assignment of the epoxidase
catalyzing the formation of fosfomycin. J. Am. Chem. Soc. 123, 4619-4620
(2001).
23. Vaillancourt, F. H., Yeh, E., Vosburg, D. A., O'Connor, S. E. & Walsh, C. T. Cryptic
chlorination by a non-haem iron enzyme during cyclopropyl amino acid
biosynthesis. Nature 436, 1191-1194 (2005).
24. Blasiak, L. C., Vaillancourt, F. H., Walsh, C. T. & Drennan, C. L. Crystal structure
of the non-haem iron halogenase SyrB2 in syringomycin biosynthesis. Nature
440, 368-371 (2006).
25. Clifton, I. J. et al. Crystal structure of carbapenem synthase (CarC). J. Biol.
Chem. 278, 20843-20850 (2003).
26. Chang, W. C. et al. Mechanism of the C5 stereoinversion reaction in the
biosynthesis of carbapenem antibiotics. Science 343, 1140-1144 (2014).
27. Blodgett, J. A. V. et al. Unusual transformations in the biosynthesis of the
antibiotic phosphinothricin tripeptide. Nature Chem. Biol. 3, 480-485 (2007).
28. Krebs, C., Galonic Fujimori, D., Walsh, C. T. & Bollinger, J. M., Jr. Non-heme
Fe(IV)-oxo intermediates. Acc. Chem. Res. 40, 484-492 (2007).
29. Stubbe, J. & van der Donk, W. A. Protein radicals in enzyme catalysis. Chem.
Rev. 98, 705-762 (1998).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank H.-w. Liu, S. Elliott and A. Liu for comments on the
manuscript. We also thank R. Fan and J. Lee for assistance with the pre-steady
state kinetics studies, J. Caradonna for use of stopped-flow instruments, and
A. Monzingo for assistance with crystallography software. This work is supported
in part by grants from the National Institutes of Health (R01 GM093903 to P.L.;
P41 GM104603 to C.E.C.; R01 GM104896 to Y.J.Z.; and R01 GM077387 to
M.P.H.), the National Science Foundation (CHE-1309148 to P.L.; CHE-1126268
for the EPR spectrometer), the Welch Foundation (F-1778 to Y.J.Z.) and the 973
program (2013CB734000 to L.Z.). Y.G. acknowledges financial support from
Carnegie Mellon University. Crystallographic data collection was conducted at
the Advanced Light Source (beamline 5.0.3) and the Advanced Photon Source
(BL23-ID-B), Department of Energy (DOE) National User Facilities. L.Z. is an
awardee of the National Distinguished Young Scholar Program in China (31125002).


Author Contributions P.L., Y.J.Z. and L.Z. designed the study. W.Y. conducted the
crystallization experiments and structure determination. H.S., F.S., A.S.H., S.W.
and N.N. conducted the biochemical studies. C.-H.W., Y.P. and C.E.C. performed
the MS-MS analyses. H.S., Y.G., M.P.H. and A.W. conducted the pre-steady state
kinetics and EPR characterization. The manuscript was written by P.L., Y.J.Z. and
L.Z. with input from all contributing authors.


Author Information The structure factors and coordinates of FtmOx1 and its
complexes with either α-KG or fumitremorgin B have been deposited in the
Protein Data Bank with accession codes 4Y5T, 4Y5S and 4ZON. Reprints and
permissions information is available at www.nature.com/reprints. The authors
declare no competing financial interests. Readers are welcome to comment
on the online version of the paper. Correspondence and requests for materials
should be addressed to P.L. (pinghua@bu.edu) or Y.J.Z. (jzhang@cm.utexas.edu)
or L.Z. (lzhang03@gmail.com).


26 NOVEMBER 2015 | VOL 527 | NATURE | 543 


© 2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 


No statistical methods were used to predetermine sample size. 

Materials and experimental procedures. Fumitremorgin B was isolated from 
Aspergillus fumigatus strain IM-MF330 according to the procedure summarized 
in a later section. All reagents were purchased from Sigma-Aldrich unless other- 
wise stated. 

Nuclear magnetic resonance (NMR) spectra were obtained on a Bruker Avance
DRX600 spectrometer in the solvents indicated and referenced to residual ¹H and
¹³C signals in deuterated solvents. High-resolution electrospray ionization (ESI)
mass spectrometry (MS) measurements were obtained on a Bruker micrOTOF 
mass spectrometer. High-performance liquid chromatography (HPLC) was per- 
formed using an Agilent 1200 Series separations module equipped with Agilent 
1200 Series diode array detectors and an Agilent 1200 Series fraction collector, 
controlled using ChemStation. UV-vis analysis was performed on a Varian Cary 
100 Bio UV-vis spectrophotometer. 
Sub-cloning and overexpression of wild-type, Y224F- and Y224A-substituted
FtmOx1. The coding sequence of the FtmOx1 gene from A. fumigatus Af293
(accession number: XM_742088) was sub-cloned into the EcoRI and XhoI
restriction sites of the pASK-IBA3+ vector, which places it under the control
of the tet promoter and allows for the production of C-terminally Strep-tagged
FtmOx1. The final recombinant FtmOx1 includes some extra amino acid residues
at the N terminus and a Strep-tag at the C terminus for purification. The residue
numbering used in this manuscript is based on the FtmOx1 sequence deposited
in GenBank (accession number: XM_742088). Y224F- and Y224A-substituted
FtmOx1 were generated using a Stratagene QuikChange II kit according to the
manufacturer's instructions.

Plasmids encoding wild-type, Y224F- and Y224A-substituted FtmOx1
were used to transform Escherichia coli BL21 (DE3) cells (Invitrogen Inc.)
for protein overexpression. A single colony was used to inoculate a starter
culture, which was incubated at 37°C overnight. Production cultures were grown
at 37°C in Luria-Bertani medium supplemented with 100 µg ml⁻¹ ampicillin to
an optical density (OD₆₀₀) of ~0.8 and then cooled to 25°C. FtmOx1 protein
production was induced by the addition of anhydrotetracycline to a final
concentration of 250 µg l⁻¹. The cultures were grown at 25°C for an additional 16 h
before harvesting.

Purification was performed at 4°C. In a typical purification, ~30 g wet cell paste
was resuspended in 100 ml of anaerobic buffer (100 mM Tris-HCl, 50 mM NaCl,
and 5 mM 1,10-phenanthroline (pH 7.5)) in an anaerobic Coy chamber. Lysozyme
(1.0 mg ml⁻¹ final concentration) and DNase I (100 U per gram of cells) were then
added to the cell suspension, and the mixture was incubated on ice for 40 min
with gentle agitation. The cells were disrupted by sonication (20 cycles of 10 s
bursts) using a Fisher Scientific Model 505 Sonic Dismembrator. The supernatant
and the cell debris were anaerobically separated by centrifugation at 4°C for 30 min
at 20,000g. Streptomycin sulfate was added to the supernatant (~100 ml) to a
final concentration of 1% (w/v), and the mixture was incubated on ice for 30 min
with gentle agitation. The DNA precipitate was then removed by centrifugation at
20,000g for 40 min at 4°C. The resulting supernatant was mixed with Strep-Tactin
resin (50 ml) and incubated on ice for 30 min. After the cell lysate was drained
by gravity, the column was washed with washing buffer (100 mM Tris-HCl and
150 mM NaCl (pH 7.5)) until the OD₂₈₀ was <0.05. The FtmOx1 protein was
reconstituted by incubating the protein-loaded resin with 50 ml of a solution
containing 3.0 mM ammonium ferrous sulfate and 5.0 mM ascorbate at 4°C for 10 min.
After the excess solution was drained by gravity, the resin was further washed with
washing buffer until the OD₂₈₀ was <0.05. Recombinant FtmOx1 was eluted with
elution buffer (2.5 mM desthiobiotin in 100 mM Tris-HCl and 50 mM NaCl (pH
7.5)). The eluted protein was concentrated, flash frozen with liquid nitrogen, and
stored at −80°C. From 30 g of wet cell paste, ~400 mg of protein was obtained. The
purity of the protein was shown by SDS-PAGE (12%) as a single band. The FtmOx1
concentration was calculated using an ε₂₈₀ nm of 43,288 M⁻¹ cm⁻¹ determined by
amino acid analysis.
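
The concentration calculation from the extinction coefficient can be sketched with the Beer-Lambert law. This is a minimal illustration rather than the authors' procedure: the function name, the dilution-factor argument and the example absorbance reading are assumptions introduced here, and only the value of ε₂₈₀ comes from the text.

```python
# Beer-Lambert estimate of FtmOx1 concentration from absorbance at 280 nm.
# EPSILON_280 is the value quoted in the text; everything else is illustrative.
EPSILON_280 = 43_288.0  # M^-1 cm^-1


def protein_concentration_molar(a280, path_cm=1.0, dilution_factor=1.0):
    """Return the undiluted protein concentration in mol/l.

    a280            -- absorbance of the (possibly diluted) sample at 280 nm
    path_cm         -- optical path length of the cuvette in cm
    dilution_factor -- fold dilution applied before the measurement
    """
    if a280 < 0 or path_cm <= 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0; path and dilution must be > 0")
    return a280 * dilution_factor / (EPSILON_280 * path_cm)


# Hypothetical reading: a 100-fold diluted stock with A280 = 0.433.
stock_molar = protein_concentration_molar(0.433, dilution_factor=100.0)
print(f"{stock_molar * 1e3:.2f} mM")
```

Keeping the dilution factor explicit matters in practice, because concentrated stocks usually have to be diluted into the linear range of the spectrophotometer before reading.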

Selenomethionine-incorporated FtmOx1 was prepared using a modified
medium. A single colony was used to inoculate 50 ml Luria-Bertani medium
supplemented with 100 µg ml⁻¹ ampicillin, which was incubated at 37°C until the
OD₆₀₀ was ~0.5. The pre-culture (2 ml) was transferred into 150 ml of minimal
media (1 l of minimal media contained 50 ml glycerol, 12.8 g Na₂HPO₄·7H₂O, 3 g
KH₂PO₄, 0.5 g NaCl, 1 g NH₄Cl, 0.2% glucose, 0.1 mM CaCl₂, and 2.0 mM MgSO₄)
supplemented with 100 µg ml⁻¹ ampicillin, which was incubated at 37°C for an
additional 5 h. Then, 10 ml of the pre-culture was transferred into 1 l of minimal
media supplemented with 100 µg ml⁻¹ ampicillin and incubated at 37°C for 12 h.
Subsequently, 10 ml of 100× amino acid solution mix (the 100× amino acid solution
contained 100 mg lysine, 100 mg threonine, 100 mg phenylalanine, 50 mg leucine,
50 mg isoleucine, and 50 mg valine in 10 ml H₂O) and 100× selenomethionine
solution (60 mg L-selenomethionine in 10 ml H₂O) were added to the culture
medium. After 0.5 h, the temperature was decreased to 25°C, and FtmOx1
overexpression was induced by the addition of anhydrotetracycline to a final
concentration of 250 µg l⁻¹. The cultures were grown at 18°C for an additional 12 h before
harvesting.

Selenomethionine-incorporated FtmOx1 was purified according to the same 
procedure described earlier. From 5 g of wet cell paste, ~15 mg of selenomethionine- 
incorporated FtmOx1 was obtained. 

Before crystallization, FtmOx1 was further purified by gel filtration (Superdex
200, GE Healthcare) in buffer containing 100 mM Tris-HCl at pH 7.5 and 50 mM
NaCl. After gel filtration, FtmOx1 was concentrated to ~10 mg ml⁻¹ and stored at
−80°C for future crystallization experiments.

Isolation of fumitremorgin B. Aspergillus fumigatus strain IM-MF330 was 
isolated from a mud sample collected from the Yellow Sea. A small number of 
spores growing on a potato dextrose agar slant was inoculated into a 250-ml 
conical flask containing 40 ml of liquid medium (20% potato infusion, 2.0% glu- 
cose, 3.5% sea salt, and distilled water) and then cultured at 28°C for 3 days on 
a rotary shaker at 160 rpm. The seed culture (5 ml) was inoculated into 1,000-ml 
conical flasks, each containing 130 g rice and 80 ml artificial seawater, and incu- 
bated without aeration for 19 days. The fermentation product was exhaustively 
extracted with EtOAc:MeOH (80:20) to yield a crude extract. The crude extract 
was partitioned between EtOAc and H₂O. The EtOAc layer (10.4 g) was applied
to a column of silica gel using a gradient solvent system of 50-100% petroleum
ether/CH₂Cl₂ and 0-100% MeOH/CH₂Cl₂ to afford 15 fractions. Fraction MF330F
was passed through a Sephadex LH-20 column and eluted with petroleum
ether:CH₂Cl₂:MeOH (5:5:1) to yield 5 sub-fractions. The third fraction MF30F3 was
subsequently subjected to HPLC fractionation (Agilent Zorbax SB-C18, 5 µm,
250 × 9.4 mm column, 3.0 ml min⁻¹, 65% MeOH) to yield verruculogen and
fumitremorgin B, respectively.

Crystallization and data collection. FtmOx1 crystallization was set up using
the sitting-drop vapour diffusion method by mixing protein and crystallization
buffer (100 mM MES (pH 6.5), 50 mM CoCl₂, and 2 M ammonium sulfate) at a
ratio of 2:1 at room temperature. Sheet-like crystals were visible after 7 days. The
FtmOx1 and α-ketoglutarate (α-KG) complex was obtained using both soaking
and co-crystallization methods, which led to identical models. Crystal soaking
was conducted by transferring the pre-formed FtmOx1 crystals into crystallization
mother liquor containing 1 mM α-KG and incubating for 2 h at room temperature.
Co-crystallization trials included pre-mixing the protein with α-KG at a ratio
of 1:100 for 2 h before crystallization setup. The crystals were cryoprotected by the
addition of 25% glycerol in mother liquor before being vitrified in liquid nitrogen
for data collection. To obtain the structure of the FtmOx1-fumitremorgin-B
complex, we crystallized FtmOx1 in an anaerobic chamber using identical
conditions. Sheet-like crystals appeared within 3 days and continued to grow for
another week before reaching maximal size. Fumitremorgin B was dissolved to
saturation in degassed crystallization mother liquor containing 0.05% Triton X-100
and 20% glycerol. After centrifugation to discard insoluble material,
the mother liquor containing a saturating amount of fumitremorgin B was used to
soak FtmOx1 crystals as sitting drops for 90 min before cryoprotection with degassed
mother liquor containing 30% glycerol.

Crystal diffraction data were collected at the Advanced Photon Source
beamline BL23-ID-B (Argonne, Illinois) for FtmOx1 (wavelength 0.97931 Å) and
selenomethionine-incorporated FtmOx1 (wavelength 0.97958 Å). The diffraction
data for the FtmOx1-α-KG and FtmOx1-fumitremorgin-B binary complexes were
collected at the Advanced Light Source beamline BL5.0.3 (Berkeley, California) at
wavelength 0.97648 Å. All data were collected in a liquid nitrogen
stream at 100 K. The data were processed using the program HKL2000³⁰. The
statistics for data collection are summarized in Extended Data Table 1.
Structure determination and refinement. The FtmOx1 structure was
determined by the single anomalous dispersion method using the selenomethionine
data set with phase information to 3.5 Å resolution. The positions of the
selenium atoms were determined and refined by Phenix.Autosol³¹⁻³³ followed by the density
modification program DM in CCP4 suite**. An initial model was built based
on the phase information using the Buccaneer program‘, further extended
and corrected manually by the COOT program“. The resolution was extended
to the high-resolution limit of 1.95 Å using the native protein data set. Iterative
cycles of optimization were performed to improve the quality of the model using
the refinement program PHENIX.Refine?°°, followed by manual rebuilding in
COOT*. A portion of the diffraction data (5%) was reserved as an unbiased test
set for cross validation (Rfree) for the model that eventually had an Rwork of 16.1%
and an Rfree of 19.9%. The structures of the FtmOx1-α-KG binary complex and
FtmOx1-fumitremorgin-B complex were both solved by molecular replacement
with the FtmOx1 structure as the initial model using Phaser in the CCP4
package*®°*>°. The co-substrate α-KG and substrate fumitremorgin B were built using
COOT followed by several rounds of refinement by PHENIX.Refine*?>. For the
FtmOx1-α-KG complex, the final Rwork was 20.3% with an Rfree of 26.0%. For the
FtmOx1-fumitremorgin-B complex, the final Rwork was refined to 16.7% and Rfree
to 20.3%. Model quality for all of the structures was evaluated with Ramachandran
analysis and MolProbity*!. The structures showed no outliers and most residues were in the
favoured region of Ramachandran statistics (98.6% for apo FtmOx1, 98.1% for
the FtmOx1-α-KG complex and 98.3% for FtmOx1-fumitremorgin-B). When
evaluated by MolProbity, all three structures rank 100% within the specific
resolution range (apo FtmOx1 with a MolProbity score of 1.04, FtmOx1-α-KG complex
1.26 and FtmOx1-fumitremorgin-B 1.12, respectively). Refinement statistics are
summarized in Extended Data Table 1. Figure 2 and Extended Data Fig. 5 were
prepared with PyMol®.

Oxygen concentration determination for oxygenated buffer. Oxygen-saturated
buffer®! (10 ml) was transferred into syringes (12 cm³) with a long needle, and
the syringe was then sealed. An alkaline KI solution (2.1 M KI and 8.7 M KOH
prepared using oxygen-free water) and MnSO₄ solution (2.1 M) in oxygen-free
water were prepared in a Coy chamber and transferred out using syringes sealed
by a rubber septum. The alkaline KI solution (0.2 ml) and the MnSO₄ solution
(0.2 ml) were quickly aspirated into the syringe containing the 10 ml of
oxygen-saturated buffer. Then, the syringe was quickly sealed again. The sample in the syringe
was intensely mixed (turning the syringe ~10 times upside-down until the entire
syringe was filled with the floating Mn(OH)₃ precipitate); the Mn(OH)₃ precipitate
formed completely in 45 min according to the following reaction:

$$4\,\mathrm{Mn^{2+}} + \mathrm{O_2} + 8\,\mathrm{OH^-} + 2\,\mathrm{H_2O} \rightarrow 4\,\mathrm{Mn(OH)_3}\downarrow \qquad (1)$$

After 45 min, H₂SO₄ solution (0.2 ml, 2.7 M) was aspirated into the syringe,
and Mn³⁺ ions oxidized iodide to iodine under acidic conditions according to
the following reactions:

$$2\,\mathrm{Mn(OH)_3(s)} + 3\,\mathrm{H_2SO_4} \rightarrow 2\,\mathrm{Mn^{3+}} + 3\,\mathrm{SO_4^{2-}} + 6\,\mathrm{H_2O} \qquad (2)$$

$$2\,\mathrm{Mn^{3+}} + 2\,\mathrm{I^-} \rightarrow 2\,\mathrm{Mn^{2+}} + \mathrm{I_2} \qquad (3)$$

Iodine eventually formed I₃⁻ ions with the excess KI:

$$\mathrm{I_2} + \mathrm{I^-} \rightarrow \mathrm{I_3^-} \qquad (4)$$

The resulting iodine solution was transferred to a sample bottle and immediately
titrated with standardized 2.5 mM Na₂S₂O₃ solution:

$$\mathrm{I_3^-} + 2\,\mathrm{S_2O_3^{2-}} \rightarrow 3\,\mathrm{I^-} + \mathrm{S_4O_6^{2-}} \qquad (5)$$

According to reactions (1)-(5), one equivalent of oxygen molecules
corresponds to four equivalents of Na₂S₂O₃. Therefore, the oxygen concentration in the
oxygen-saturated buffer was determined based on the amount of standardized
2.5 mM Na₂S₂O₃ solution used for titration.
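
The 1:4 oxygen-to-thiosulfate relationship translates directly into a back-calculation of the dissolved oxygen concentration. The sketch below is illustrative only: the function name and the example titrant volume are assumptions, and it presumes that the whole 10 ml sample is titrated and that oxygen carried in by the reagents is negligible.

```python
# Back-calculate dissolved O2 from the thiosulfate volume consumed at the end point.
# Stoichiometry from reactions (1)-(5): 1 O2 -> 4 Mn(OH)3 -> 2 I2 -> 4 S2O3^2-.
THIOSULFATE_PER_O2 = 4


def dissolved_oxygen_molar(titrant_volume_ml, titrant_molar=2.5e-3,
                           sample_volume_ml=10.0):
    """Return the O2 concentration (mol/l) of the buffer sample.

    titrant_volume_ml -- volume of Na2S2O3 solution used in the titration
    titrant_molar     -- standardized Na2S2O3 concentration (2.5 mM in the text)
    sample_volume_ml  -- volume of oxygenated buffer taken (10 ml in the text)
    """
    thiosulfate_mol = titrant_molar * titrant_volume_ml / 1000.0
    oxygen_mol = thiosulfate_mol / THIOSULFATE_PER_O2
    return oxygen_mol / (sample_volume_ml / 1000.0)


# Hypothetical end point: 19.2 ml of titrant would correspond to 1.2 mM O2,
# the concentration quoted for oxygen-saturated buffer elsewhere in the Methods.
print(f"{dissolved_oxygen_molar(19.2) * 1e3:.2f} mM")
```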

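The Na₂S₂O₃ concentration was standardized with an iodine solution, which
was prepared by mixing a standard KIO₃ solution and KI solution under acidic
conditions:

$$\mathrm{KIO_3} + 5\,\mathrm{KI} + 6\,\mathrm{H^+} \rightarrow 3\,\mathrm{I_2} + 6\,\mathrm{K^+} + 3\,\mathrm{H_2O};\quad \mathrm{I_2} + \mathrm{I^-} \rightarrow \mathrm{I_3^-};\quad \mathrm{I_3^-} + 2\,\mathrm{S_2O_3^{2-}} \rightarrow 3\,\mathrm{I^-} + \mathrm{S_4O_6^{2-}}$$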
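
Following the same reactions, each mole of KIO₃ liberates three moles of I₂ and therefore consumes six moles of thiosulfate. A minimal sketch of the standardization arithmetic, assuming KI and acid are in excess; the function name and the example volumes are hypothetical.

```python
# Standardize a thiosulfate solution against a primary-standard KIO3 solution.
# Stoichiometry: 1 KIO3 -> 3 I2 -> 6 S2O3^2- (KI and acid assumed in excess).
THIOSULFATE_PER_IODATE = 6


def thiosulfate_molar(kio3_molar, kio3_volume_ml, titrant_volume_ml):
    """Return the Na2S2O3 concentration (mol/l) from the titration end point."""
    if titrant_volume_ml <= 0:
        raise ValueError("titrant volume must be positive")
    iodate_mol = kio3_molar * kio3_volume_ml / 1000.0
    thiosulfate_mol = THIOSULFATE_PER_IODATE * iodate_mol
    return thiosulfate_mol / (titrant_volume_ml / 1000.0)


# Hypothetical example: 5.0 ml of 1.0 mM KIO3 consuming 12.0 ml of titrant.
print(f"{thiosulfate_molar(1.0e-3, 5.0, 12.0) * 1e3:.2f} mM")
```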


α-KG and oxygen stoichiometries in the FtmOx1 catalysis. To examine whether
FtmOx1 is capable of catalysing verruculogen oxidation in the absence of other
reductants, the FtmOx1 reaction was conducted under the following
conditions: a 200 µl anaerobic mixture in 100 mM Tris-HCl (pH 7.5) contained
fumitremorgin B (360 µM), α-KG (4 mM), and variable amounts of iron-loaded FtmOx1
(0.25×, 0.5×, 1×, and 2× of iron-loaded FtmOx1 relative to the fumitremorgin
B concentration). The reaction was initiated by quickly mixing the above
solution with 200 µl oxygen-saturated buffer (1.2 mM) in the Coy chamber to make a
solution containing 600 µM oxygen. The final reaction mixture contained 180 µM
fumitremorgin B, 2 mM α-ketoglutarate, 600 µM oxygen, and a variable amount of
iron-loaded FtmOx1 (0.25×, 0.5×, 1×, and 2× of FtmOx1 relative to the
verruculogen concentration). After the reaction was initiated, the reaction mixture was
sealed and incubated for 0.5 h at 37°C. The enzymatic reaction was quenched by
adding 300 µl chloroform, the precipitated protein was removed by centrifugation
at 13,000g for 10 min, and the chloroform layer was carefully removed. The reaction
mixture was extracted one more time using a second 300-µl volume of chloroform.
The combined chloroform layers were concentrated by rotary evaporation, and
the residue was re-dissolved in 100 µl acetonitrile and subjected to HPLC analysis.

To determine α-KG stoichiometry, 200 µl anaerobic reaction mixture
(in 100 mM Tris-HCl (pH 7.5)) contained 360 µM fumitremorgin B, 240 µM
iron-loaded FtmOx1, and variable amounts of α-KG. The concentration of α-KG
was varied to make reaction mixtures containing 0.5×, 1.0×, 1.5×, and 2.0× of
α-KG relative to the FtmOx1 concentration. The reaction was initiated by quickly
mixing 200 µl of oxygen-saturated buffer (1.2 mM) in the Coy chamber to make a
final oxygen concentration of 600 µM. The resulting reaction mixtures contained
a final concentration of 180 µM fumitremorgin B, 120 µM iron-loaded FtmOx1,
600 µM oxygen, and variable amounts of α-KG. After initiation, the reaction was
sealed and incubated for 0.5 h at 37°C. The enzymatic reaction was quenched
by adding 300 µl chloroform, the precipitated protein was removed by
centrifugation at 13,000g for 10 min, and the chloroform layer was carefully removed.
The reaction mixture was extracted one more time using a second 300-µl volume
of chloroform. The combined chloroform layers were concentrated by rotary
evaporation, and the residue was re-dissolved in 100 µl of acetonitrile and subjected
to HPLC analysis.

To determine oxygen stoichiometry, oxygen-saturated buffer was added to a
600 µl anaerobic reaction mixture (100 mM Tris-HCl (pH 7.5) buffer, 360 µM of
fumitremorgin B, 240 µM of iron-loaded FtmOx1, and 480 µM of α-KG). To
determine the amount of product formation under 1× of oxygen relative to iron-loaded
FtmOx1 concentration, the above reaction mixture was quickly mixed with 120 µl
of oxygen-saturated buffer (1.2 mM) in the Coy chamber. To assess the amount
of product formation under 2× of oxygen relative to iron-loaded FtmOx1
concentration, the above mixture was quickly mixed with 240 µl of oxygen-saturated
buffer (1.2 mM) in the Coy chamber. To determine the amount of product
formation under 3× of oxygen relative to iron-loaded FtmOx1 concentration, the
above mixture was quickly mixed with 360 µl of oxygen-saturated buffer (1.2 mM)
in the Coy chamber. After reaction initiation, the reaction mixtures were sealed and
incubated for 0.5 h at 37°C. The enzymatic reaction was quenched by adding 300 µl
chloroform, the precipitated protein was removed by centrifugation at 13,000g for
10 min, and the chloroform layer was carefully separated. The reaction mixture
was extracted one more time using a second 300-µl volume of chloroform. The
combined chloroform layers were concentrated by rotary evaporation, and the
residue was re-dissolved in 100 µl of acetonitrile and subjected to HPLC analysis.
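
The mixing arithmetic behind the three stoichiometry experiments above can be checked with a short calculation. This is a sketch under the assumption that volumes are additive and that the oxygen-saturated buffer is 1.2 mM; the helper names `mix` and `oxygen_equivalents` are introduced here for illustration.

```python
# Dilution arithmetic for reactions started by adding oxygen-saturated buffer.
# Volumes in microlitres, concentrations in micromolar; volumes assumed additive.


def mix(volume_a_ul, conc_a_um, volume_b_ul, conc_b_um):
    """Final concentrations (µM) after combining solution A with solution B."""
    total_ul = volume_a_ul + volume_b_ul
    species = list(conc_a_um) + [s for s in conc_b_um if s not in conc_a_um]
    return {
        s: (conc_a_um.get(s, 0.0) * volume_a_ul
            + conc_b_um.get(s, 0.0) * volume_b_ul) / total_ul
        for s in species
    }


def oxygen_equivalents(mix_volume_ul, enzyme_um, buffer_volume_ul, oxygen_um=1200.0):
    """Moles of O2 added per mole of iron-loaded enzyme in the anaerobic mixture."""
    return (buffer_volume_ul * oxygen_um) / (mix_volume_ul * enzyme_um)


# First experiment: 200 µl anaerobic mixture plus 200 µl oxygen-saturated buffer.
final = mix(200.0, {"fumitremorgin B": 360.0, "alpha-KG": 4000.0},
            200.0, {"O2": 1200.0})
print(final)

# Oxygen titration: 600 µl mixture at 240 µM enzyme plus 120, 240 or 360 µl buffer.
equivalents = [oxygen_equivalents(600.0, 240.0, v) for v in (120.0, 240.0, 360.0)]
print(equivalents)
```

The same `mix` helper applies to any of the reaction set-ups in this section, which makes it easy to confirm a stated final concentration against the stated stock concentrations and volumes.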
Reactions using Y224F-substituted FtmOx1. For the reactions using
Y224F-substituted FtmOx1, the anaerobic reaction mixture (600 µl, in 100 mM
Tris-HCl (pH 7.5)) contained 400 µM Y224F-substituted FtmOx1 containing 300 µM
Fe(II), 400 µM fumitremorgin B, and 1,200 µM α-KG. The reaction was initiated
by quickly adding 400 µl of oxygen-saturated buffer (1.2 mM) in the Coy
chamber. The resulting reaction mixtures contained a final concentration of 240 µM
fumitremorgin B, 240 µM Y224F-substituted FtmOx1 containing 192 µM Fe(II), and
720 µM α-KG. After initiation, the reaction mixture was sealed and incubated for
0.5 h at 37°C. The enzymatic reaction was quenched by adding 300 µl chloroform,
the precipitated protein was removed by centrifugation at 13,000g for 10 min, and
the chloroform layer was carefully separated. The reaction mixture was extracted
once more using a second 300-µl volume of chloroform. The combined chloroform
layers were concentrated by rotary evaporation, and the residue was re-dissolved
in 100 µl of acetonitrile and subjected to HPLC analysis.

Reactions using Y224A-substituted FtmOx1. Y224A-substituted FtmOx1 was
analysed by a procedure similar to that described in ‘Reactions using
Y224F-substituted FtmOx1’, except that Y224A-substituted FtmOx1 was used instead of
Y224F-substituted FtmOx1.

HPLC analysis of the FtmOx1 reaction products. Enzymatic reaction products
were routinely analysed by HPLC using a Phenomenex reversed-phase C18 column
(250 mm × 4 mm, 5 µm; Phenomenex). A linear gradient of 30-100% (v/v)
acetonitrile in water was run for 30 min with a flow rate of 0.7 ml min⁻¹, followed by 100%
(v/v) acetonitrile for 5 min. Before the next injection, the column was equilibrated
with 30% (v/v) acetonitrile for 2 min. The separation profile was monitored using
a photodiode array detector at 300 nm.
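
For readers programming a pump table, the analytical gradient can be written as a piecewise function of run time. This is an illustrative sketch: the function name is hypothetical, and the return to 30% acetonitrile is modelled as an instantaneous step, which the text does not specify.

```python
# Piecewise description of the analytical HPLC gradient described above:
# 30 -> 100% acetonitrile over 30 min, 100% for 5 min, then 30% for 2 min.


def acetonitrile_percent(t_min, start=30.0, end=100.0,
                         ramp_min=30.0, hold_min=5.0, equil_min=2.0):
    """Return % (v/v) acetonitrile at run time t_min (minutes after injection)."""
    total_min = ramp_min + hold_min + equil_min
    if t_min < 0 or t_min > total_min:
        raise ValueError(f"time must lie between 0 and {total_min} min")
    if t_min <= ramp_min:                 # linear ramp
        return start + (end - start) * t_min / ramp_min
    if t_min <= ramp_min + hold_min:      # isocratic wash at the final composition
        return end
    return start                          # re-equilibration (assumed step change)


for t in (0, 15, 30, 33, 36):
    print(t, acetonitrile_percent(t))
```

The preparative and quench-flow methods described elsewhere in the Methods use the same shape with different ramp and hold times, so they can be represented by changing the keyword arguments.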

Products of the reaction (compounds 2 and 3) using wild-type FtmOx1 were 
characterized by NMR and high-resolution mass spectrometry (see Supplementary 
Information). 

Isolating products from reactions using wild-type, Y224F-, or Y224A-
substituted FtmOx1. To characterize the reaction products, a large-scale
reaction (270 ml) was performed using purified FtmOx1 (41.1 µM) containing 32 µM
Fe(II), fumitremorgin B (30.9 µM), and 60 µM α-KG in Tris-HCl buffer (50 mM,
pH 7.0) at 30°C for 2 h. Chloroform (300 ml) was added to the reaction mixture.
The chloroform layer was transferred into centrifuge bottles and centrifuged at
5,000 rpm for 10 min to remove precipitated proteins. The reaction mixture was
extracted once more using a second 300-ml volume of chloroform. The combined
chloroform layers were dried over Na₂SO₄ for 0.5 h and concentrated by rotary
evaporation. The residue was subjected to HPLC separation on a C18 column
(4.6 × 150 mm). A linear gradient of 30-100% (v/v) acetonitrile in water was run
for 25 min with a flow rate of 0.7 ml min⁻¹, followed by 100% (v/v) acetonitrile
for 5 min. Before the next injection, the column was equilibrated with 30% (v/v)
acetonitrile for 2 min. The elution was monitored using a photodiode array
detector at 300 nm.

Products of the reaction using Y224F- and Y224A-substituted FtmOx1 variants
(compounds 2-6) were characterized using NMR and high-resolution mass
spectrometry (see Supplementary Information).
Determining the α-KG dissociation constant. To maintain an anaerobic
environment, all of the solutions were made anaerobic by several rounds of
freeze-pump-thaw degassing. All spectroscopic studies used a 1-cm light path
cuvette. After blanking against FtmOx1, spectra were recorded for samples to
which anaerobic α-KG had been added. Titration plots were obtained by plotting
the absorption at 520 nm. The titration data were fitted to equation (6)

$$A_{\mathrm{obs}} = A_{\mathrm{max}}\,\frac{[\mathrm{E{-}L}]}{n[\mathrm{FtmOx1_T}]} \qquad (6)$$

in which the observed absorption (Aobs) was equal to the maximal
absorption (Amax) multiplied by the concentration of enzyme-ligand complex
([E-L]) divided by the concentration of ligand binding sites (the number (n)
of ligands bound per subunit multiplied by the total concentration of FtmOx1
containing Fe(II) ([FtmOx1T])). The concentration of enzyme-ligand complex was
obtained using equation (7)

$$[\mathrm{E{-}L}] = \frac{\left(K_{\mathrm{d}} + [\alpha\text{-}\mathrm{KG_T}] + n[\mathrm{FtmOx1_T}]\right) - \sqrt{\left(K_{\mathrm{d}} + [\alpha\text{-}\mathrm{KG_T}] + n[\mathrm{FtmOx1_T}]\right)^{2} - 4[\alpha\text{-}\mathrm{KG_T}]\,n[\mathrm{FtmOx1_T}]}}{2} \qquad (7)$$

where Kd is the apparent ligand affinity and [α-KGT] is the total α-KG
concentration. The Kd values were determined from equations (6) and (7) using nonlinear
curve fitting (OriginPro 8 software).
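
Equations (6) and (7) can be fitted with any nonlinear least-squares routine. The sketch below is not the OriginPro procedure used in the paper; it is a dependency-free illustration that swaps in a golden-section search over log₁₀(Kd), with Amax solved in closed form at each trial Kd, and it assumes the squared error is unimodal in log₁₀(Kd). The function names, search bounds and the synthetic titration values are all hypothetical.

```python
import math


def complex_concentration(kd, ligand_total, sites_total):
    """Equation (7): the smaller root of the binding quadratic (molar units)."""
    b = kd + ligand_total + sites_total
    discriminant = max(b * b - 4.0 * ligand_total * sites_total, 0.0)
    return (b - math.sqrt(discriminant)) / 2.0


def predicted_absorbance(kd, a_max, ligand_total, enzyme_total, n=1.0):
    """Equation (6): A_obs = A_max * [E-L] / (n * [FtmOx1_T])."""
    sites_total = n * enzyme_total
    return a_max * complex_concentration(kd, ligand_total, sites_total) / sites_total


def fit_kd(ligand_totals, absorbances, enzyme_total, n=1.0,
           kd_low=1e-9, kd_high=1e-1, iterations=200):
    """Least-squares estimate of (Kd, A_max) from a titration curve."""

    def evaluate(log_kd):
        kd = 10.0 ** log_kd
        # Curve shape for A_max = 1; the best A_max is then a linear regression.
        shape = [predicted_absorbance(kd, 1.0, lig, enzyme_total, n)
                 for lig in ligand_totals]
        a_max = (sum(a * s for a, s in zip(absorbances, shape))
                 / sum(s * s for s in shape))
        sse = sum((a - a_max * s) ** 2 for a, s in zip(absorbances, shape))
        return sse, a_max

    lo, hi = math.log10(kd_low), math.log10(kd_high)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):           # golden-section search on log10(Kd)
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if evaluate(left)[0] < evaluate(right)[0]:
            hi = right
        else:
            lo = left
    best_log_kd = (lo + hi) / 2.0
    return 10.0 ** best_log_kd, evaluate(best_log_kd)[1]


# Synthetic, noise-free titration (hypothetical values, not data from the paper).
ENZYME_TOTAL = 5e-4                       # M of Fe(II)-loaded enzyme
TRUE_KD, TRUE_A_MAX = 2e-4, 0.5
ligands = [1e-4, 2.5e-4, 5e-4, 1e-3, 2e-3, 4e-3]
data = [predicted_absorbance(TRUE_KD, TRUE_A_MAX, lig, ENZYME_TOTAL)
        for lig in ligands]
kd_fit, a_max_fit = fit_kd(ligands, data, ENZYME_TOTAL)
print(kd_fit, a_max_fit)
```

Using the quadratic form rather than a simple hyperbola matters here because the enzyme concentration is comparable to Kd, so the free ligand concentration cannot be approximated by the total.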

Self-hydroxylation in wild-type FtmOx1 protein. The FtmOx1 wild-type protein
(0.6 mM) was mixed with α-KG (2.0 mM) in the anaerobic Coy chamber to form
the pink species, and the UV-vis spectrum was recorded anaerobically using an
S.I. Photonics CCD-440 spectrophotometer. All spectroscopic studies used a 1-cm
light path cuvette. Upon exposing the above solution to O₂, the solution slowly
changed to a blue colour, and the process was monitored using a Cary Bio UV-vis
spectrometer.

Self-hydroxylation in FtmOx1(Y224F) variant. The Y224F-substituted FtmOx1
(1.1 mM) was mixed with α-KG (4.0 mM) in the anaerobic Coy chamber to form
the binary complex (a pink species), and the UV spectrum was monitored
anaerobically using an S.I. Photonics CCD-440 spectrophotometer. All spectroscopic
studies used a 1-cm light path cuvette. Upon exposing the above solution to O₂,
the solution slowly changed to a blue colour, and the process was monitored using
a Cary Bio UV-vis spectrometer.

MS/MS analysis of FtmOx1. The following protein samples were analysed by
tandem MS: (1) wild-type FtmOx1; (2) wild-type FtmOx1 treated with α-KG and
oxygen; (3) Y224F-substituted FtmOx1; (4) Y224F-substituted FtmOx1 treated
with α-KG and oxygen; and (5) Y224F-substituted FtmOx1 after single-turnover
experiments in the presence of fumitremorgin B, α-KG, and oxygen. These protein
samples (~1.5 nmol) were dissolved in 50 mM ammonium bicarbonate (pH 8.0)
buffer to make a 50 µl solution. Trypsin Gold (Promega US) was added to these
solutions in a 1:50 (w/w) ratio, and the proteins were digested for 18 h at 37°C. A
C18 Ziptip (Millipore) was then used to desalt each peptide sample. Each digested
sample (500 fmol) was injected and analysed by liquid chromatography (LC)-MS/MS
on either an LTQ-Orbitrap XL mass spectrometer or a Q Exactive Plus Hybrid
Quadrupole-Orbitrap mass spectrometer (Thermo Fisher Scientific) coupled with
a Triversa Nanomate system (Advion Biosystems, Inc.), and a nanoACQUITY
UPLC (Waters) with C18 reversed-phase trap (2G-V/M Trap, 5 µm Symmetry C18,
180 µm × 20 mm) and analytical (1.7 µm BEH130 C18, 150 µm × 100 mm)
columns. Mobile phase A consisted of 98:2 water:ACN with 0.1% formic acid (FA)
and mobile phase B contained 98:2 ACN:water with 0.1% FA. Peptide samples
were loaded into the trap column at 2% B with a flow rate of 4.0 µl min⁻¹ for 4 min
and then transferred to the analytical column at 0.5 µl min⁻¹. The gradient was
increased to 40% B over 40 min. For tandem MS analyses, data-dependent
high-resolution higher-energy collisional dissociation mass spectra were acquired in
the Orbitrap mass analysers and Xcalibur was used for data analysis. To identify
target peptides, the ProteinProspector program from University of California,
San Francisco was used to predict the potential product ions and match them to
MS/MS product ions. The mass spectra were manually examined to verify the
assignments.

Pre-steady-state characterization of FtmOx1. Stopped-flow experiments were
performed on an Applied Photophysics SX20 stopped-flow spectrometer
operating in an MBraun UNilab glove box. To maintain an anaerobic environment, all
of the solutions were prepared in an inert atmosphere box. An oxygen-saturated
buffer solution (100 mM Tris-HCl (pH 7.5)) was mixed with an equal volume of an
oxygen-free solution containing FtmOx1 (0.65 mM), Fe(II) (0.58 mM), α-KG
(12 mM), substrate (0.58 mM), and 20% glycerol to initiate the reaction.
Absorbance scans from 300 to 700 nm were collected with a diode-array detector
at 8°C. The resulting data were processed using SigmaPlot software.

Freeze-quench experiments were performed using a KinTek quench-flow
instrument. Analogous to the stopped-flow experiments, an oxygen-saturated
buffer solution (100 mM Tris-HCl (pH 7.5)) was mixed with an equal volume of
an oxygen-free solution containing FtmOx1 (0.65 mM), Fe(II) (0.58 mM), α-KG
(12 mM), substrate (0.58 mM), and 20% glycerol to initiate the reaction at 8°C.
The resulting reaction was terminated by injection of the solution into liquid
ethane (−90°C) at various time points. The reaction time of a freeze-quenched
sample is the sum of the ageing time and the quench time. The ageing time was
the transit time for the reaction mixture through the ageing hose. The quench
time corresponded to the time required after injection into the cryosolvent for
the reaction mixture to be cooled sufficiently to prevent further reaction and was
estimated as ~5 ms (ref. 63).

The chemical-quench-flow experiments were performed using a KinTek
quench-flow instrument. Analogous to the freeze-quench experiments, an
oxygen-saturated buffer solution (100 mM Tris-HCl (pH 7.5)) was mixed with
an equal volume of an oxygen-free solution containing FtmOx1 (0.65 mM), Fe(II)
(0.58 mM), α-KG (12 mM), substrate (0.58 mM), and 20% glycerol to initiate
the reaction at 8°C. The resulting reaction was terminated by injecting the
solution into a microcentrifuge tube containing four volumes of acetone at the
desired reaction times. Before HPLC analysis, the samples were centrifuged to
remove protein, and the supernatant was concentrated by rotary evaporation.
The concentrated samples were subjected to HPLC separation on a C18 column
(4.6 × 100 mm). A linear gradient of 30-100% (v/v) acetonitrile in water was run
for 25 min with a flow rate of 0.5 ml min⁻¹, followed by 100% (v/v) acetonitrile
for 3 min. Before the next injection, the column was equilibrated with 30% (v/v)
acetonitrile for 2 min. The substances were detected with a photodiode array
detector at 300 nm.

X-band (9.64 GHz) EPR spectra were recorded on a Bruker E500A spectrom- 
eter equipped with an Oxford ESR 910 cryostat for low-temperature measure- 
ments. The microwave frequency was calibrated with a frequency counter, and 
the magnetic field was calibrated with an NMR gaussmeter. The temperature 
of the X-band cryostat was calibrated with a carbon-glass resistor temperature 
probe (CGR-1-1000 LakeShore Cryotronics). For all EPR spectra, a modulation 
frequency and amplitude of 100 kHz and 1 mT were used. The EPR spectral sim- 
ulations were performed using the simulation software Spin Count developed 
by one of the authors. 
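
For orientation, the g-values quoted for these X-band spectra can be related to the magnetic field through the standard resonance condition hν = gμ_B·B. The sketch below is not part of the authors' Spin Count software; the function name is hypothetical, and the physical constants are CODATA values.

```python
PLANCK_J_S = 6.62607015e-34             # Planck constant (J s)
BOHR_MAGNETON_J_PER_T = 9.2740100783e-24  # Bohr magneton (J/T)


def resonance_field_mT(g_value: float, frequency_ghz: float = 9.64) -> float:
    """Field (mT) at which a signal with the given g-value resonates."""
    if g_value <= 0:
        raise ValueError("g-value must be positive")
    field_tesla = PLANCK_J_S * frequency_ghz * 1e9 / (g_value * BOHR_MAGNETON_J_PER_T)
    return field_tesla * 1e3


# g-values reported for the high-spin ferric signal, plus the g = 2 region
for g in (4.54, 4.26, 3.93, 2.0):
    print(f"g = {g:.2f} -> {resonance_field_mT(g):.1f} mT")
```

At 9.64 GHz the three ferric g-values fall at roughly 152, 162 and 175 mT, while g = 2 lies near 344 mT.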


30. Minor, W. & Otwinowski, Z. in Methods in Enzymology, Macromolecular 
Crystallography (Academic Press, 1997). 

31. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for 
macromolecular structure solution. Acta Crystallogr. D 66, 213-221 (2010). 

32. Terwilliger, T. C. Maximum-likelihood density modification. Acta Crystallogr. D 
56, 965-972 (2000). 

33. Terwilliger, T. C. et al. Decision-making in structure solution using Bayesian 
estimates of map quality: the PHENIX AutoSol wizard. Acta Crystallogr. D 65, 
582-601 (2009). 

34. Baker, D., Bystroff, C., Fletterick, R. J. & Agard, D. A. PRISM: topologically 
constrained phased refinement for macromolecular crystallography. Acta 
Crystallogr. D 49, 429-439 (1993). 

35. Bricogne, G. Geometric sources of redundancy in intensity data and their use 
for phase determination. Acta Crystallogr. A 30, 395-405 (1974). 

36. Brünger, A. T. Free R value: a novel statistical quantity for assessing the 
accuracy of crystal structures. Nature 355, 472-475 (1992). 

37. Cowtan, K. Error estimation and bias correction in phase-improvement 
calculations. Acta Crystallogr. D 55, 1555-1567 (1999). 

38. Cowtan, K. D. & Main, P. Improvement of macromolecular electron-density 
maps by the simultaneous application of real and reciprocal space constraints. 
Acta Crystallogr. D 49, 148-157 (1993). 

39. Cowtan, K. D. & Main, P. Phase combination and cross validation in iterated 
density-modification calculations. Acta Crystallogr. D 52, 43-48 (1996). 

40. Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta 
Crystallogr. D 67, 235-242 (2011). 

41. Sayre, D. Least-squares phase refinement. II. High-resolution phasing of a 
small protein. Acta Crystallogr. A 30, 180-184 (1974). 

42. Schuller, D. J. MAGICSQUASH: more versatile non-crystallographic averaging 
with multiple constraints. Acta Crystallogr. D 52, 425-434 (1996). 

43. Swanson, S. M. Core tracing: depicting connections between features in 
electron density. Acta Crystallogr. D 50, 695-708 (1994). 

44. Wang, B. C. Resolution of phase ambiguity in macromolecular crystallography. 
Methods Enzymol. 115, 90-112 (1985). 

45. Zhang, K. & Main, P. The use of Sayre’s equation with solvent flattening and 
histogram matching for phase extension and refinement of protein structures. 
Acta Crystallogr. A 46, 377-381 (1990). 


© 2015 Macmillan Publishers Limited. All rights reserved 


46. Cowtan, K. The Buccaneer software for automated model building. 1. Tracing 
protein chains. Acta Crystallogr. D 62, 1002-1011 (2006). 

47. Cowtan, K. Fitting molecular fragments into electron density. Acta Crystallogr. D 
64, 83-89 (2008). 

48. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of 
Coot. Acta Crystallogr. D 66, 486-501 (2010). 

49. Afonine, P. V. et al. Towards automated crystallographic structure refinement 
with phenix.refine. Acta Crystallogr. D 68, 352-367 (2012). 

50. Berkholz, D. S., Shapovalov, M. V., Dunbrack, R. L., Jr & Karplus, P. A. 
Conformation dependence of backbone geometry in proteins. Structure 17, 
1316-1325 (2009). 

51. Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular 
crystallography. Acta Crystallogr. D 66, 12-21 (2010). 

52. Headd, J. J. et al. Use of knowledge-based restraints in phenix.refine to improve 
macromolecular refinement at low resolution. Acta Crystallogr. D 68, 381-390 
(2012). 

53. Moriarty, N. W., Grosse-Kunstleve, R. W. & Adams, P. D. electronic Ligand 
Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate 
and restraint generation. Acta Crystallogr. D 65, 1074-1080 (2009). 

54. Tronrud, D. E., Berkholz, D. S. & Karplus, P. A. Using a conformation-dependent 
stereochemical library improves crystallographic refinement of proteins. Acta 
Crystallogr. D 66, 834-842 (2010). 


55. Urzhumtseva, L., Afonine, P. V., Adams, P. D. & Urzhumtsev, A. Crystallographic 
model quality at a glance. Acta Crystallogr. D 65, 297-300 (2009). 

56. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 
658-674 (2007). 

57. McCoy, A. J., Grosse-Kunstleve, R. W., Storoni, L. C. & Read, R. J. Likelihood- 
enhanced fast translation functions. Acta Crystallogr. D 61, 458-464 (2005). 

58. Read, R. J. Pushing the boundaries of molecular replacement with maximum 
likelihood. Acta Crystallogr. D 57, 1373-1382 (2001). 

59. Storoni, L. C., McCoy, A. J. & Read, R. J. Likelihood-enhanced fast rotation 
functions. Acta Crystallogr. D 60, 432-438 (2004). 

60. Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 1.3r1 
(2010). 

61. Helm, I., Jalukse, L., Vilbaste, M. & Leito, I. Micro-Winkler titration method for 
dissolved oxygen concentration measurement. Anal. Chim. Acta 648, 167-173 
(2009). 

62. Ryle, M. J., Padmakumar, R. & Hausinger, R. P. Stopped-flow kinetic analysis 
of Escherichia coli taurine/α-ketoglutarate dioxygenase: interactions with 
α-ketoglutarate, taurine, and oxygen. Biochemistry 38, 15278-15286 
(1999). 

63. Baldwin, J. et al. Mechanism of rapid electron transfer during oxygen activation 
in the R2 subunit of Escherichia coli ribonucleotide reductase. 1. Evidence for a 
transient tryptophan radical. J. Am. Chem. Soc. 122, 12195-12206 (2000). 


Extended Data Figure 1 | Characterization of FtmOx1–α-KG complex. 
[Figure: panels a–c plot absorbance against [α-KG] (mM).] 
a, Wild-type FtmOx1 and α-KG binding curve. The increase in absorbance at 
520 nm as a function of α-KG concentration when it was added to a solution 
of wild-type FtmOx1 (0.9 mM) and Fe(II) (0.72 mM) is plotted. On the basis of 
the equations described in the Methods (determining the α-KG dissociation 
constant), the Kd for wild-type FtmOx1 and α-KG is ~185 ± 35 μM. 
b, Y224F-substituted FtmOx1 and α-KG binding curve. The increase of 
absorbance at 520 nm as a function of α-KG concentration when it was added 
to a solution of Y224F-substituted FtmOx1 (0.9 mM) and Fe(II) (0.7 mM) 
is plotted. Kd for Y224F-substituted FtmOx1 and α-KG is ~198 ± 58 μM. 
c, Y224A-substituted FtmOx1 and α-KG binding curve. The increase of 
absorbance at 520 nm as a function of α-KG concentration when it was added 
to a solution of Y224A-substituted FtmOx1 (0.7 mM) and Fe(II) (0.51 mM) is 
plotted. Kd for Y224A-substituted FtmOx1 and α-KG is 204 ± 43 μM. In a–c, 
the Kd was calculated based on the concentration of iron-loaded FtmOx1. 
The experiments were replicated three times and error bars represent s.e.m. 
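
The fitting equations themselves are given in the Methods and are not reproduced here. As a rough illustration of why the Kd is referenced to the iron-loaded enzyme concentration, the sketch below uses a generic 1:1 binding model with ligand depletion (the quadratic binding equation). This model is an assumption for illustration and may differ from the authors' equations; the function names are hypothetical.

```python
import math


def bound_complex_mM(enzyme_mM: float, ligand_mM: float, kd_mM: float) -> float:
    """Concentration of the 1:1 enzyme-ligand complex (quadratic binding equation).

    enzyme_mM and ligand_mM are total concentrations; kd_mM is the
    dissociation constant. Solves Kd = [E_free][L_free]/[EL] exactly.
    """
    if enzyme_mM <= 0 or ligand_mM < 0 or kd_mM <= 0:
        raise ValueError("enzyme and Kd must be > 0 and ligand must be >= 0")
    total = enzyme_mM + ligand_mM + kd_mM
    return (total - math.sqrt(total * total - 4.0 * enzyme_mM * ligand_mM)) / 2.0


def fraction_saturated(enzyme_mM: float, ligand_mM: float, kd_mM: float) -> float:
    """Fraction of enzyme carrying ligand; proportional to the absorbance change."""
    return bound_complex_mM(enzyme_mM, ligand_mM, kd_mM) / enzyme_mM


# Values taken from panel a: 0.72 mM iron-loaded enzyme, Kd ~ 0.185 mM
for alpha_kg_mM in (0.0, 0.5, 1.0, 2.0, 5.0):
    saturation = fraction_saturated(0.72, alpha_kg_mM, 0.185)
    print(f"[a-KG] = {alpha_kg_mM:.1f} mM -> fraction bound {saturation:.2f}")
```

Because the enzyme concentration is several times larger than the Kd, a simple hyperbola that ignores ligand depletion would misestimate the curve, which is the reason for using the quadratic form in this sketch.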


Extended Data Figure 2 | Suppression of DOPA formation by the 
presence of substrate fumitremorgin B. 
[Figure: absorbance plotted against wavelength (nm), 400–800 nm.] 
There is no immediate evidence for the formation of DOPA upon the exposure 
of the FtmOx1–α-KG complex to O2 when the substrate fumitremorgin B is 
present. Spectra were recorded after FtmOx1 was used as the control to blank 
the UV-visible absorption reading. 


Extended Data Figure 3 | HPLC chromatograms of the FtmOx1 reaction 
enzyme-concentration dependence. 
[Figure: intensity (300 nm) plotted against time (min), 16–30 min; traces are 
labelled FtmOx1:sub = 2:1, 1:1, 0.5:1 and 0.25:1, and without α-KG.] 
Chromatograms of FtmOx1 reactions 
with increasing amounts of FtmOx1 relative to the amount of substrate. 
The reaction mixture contained 100 mM Tris-HCl (pH 7.5), 180 μM 
fumitremorgin B, 2 mM α-ketoglutarate, and variable amounts of FtmOx1. 
Identities of the peaks were assigned based on subsequent NMR and MS 
characterizations of the isolated compounds. This experiment indicates that 
FtmOx1 is capable of catalysing endoperoxide formation in the absence of 
any other reductants. 


Extended Data Figure 4 | Stoichiometry determination for α-KG and O2 
in FtmOx1 reaction. 
[Figure: panel a, x axis α-KG/Fe(II)-loaded FtmOx1; panel b, x axis 
oxygen/Fe(II)-loaded FtmOx1.] 
a, b, Equivalents of endoperoxide products (2 and 3) 
produced as a function of the ratio of α-KG to iron-loaded FtmOx1 (a) 
and oxygen to iron-loaded FtmOx1 (b). The quantification was conducted 
based on the fumitremorgin B (1), compound 2, and compound 3 internal 
standards. All calculations were based on the concentration of iron-loaded 
FtmOx1. The experiments were replicated three times and error bars 
represent s.d. 


Extended Data Figure 5 | Structural comparison of the active site 
topologies between FtmOx1 and TauD. 
[Figure: structural panels a–d; labelled features include His129, His205, 
His255, fumitremorgin B, α-ketoglutarate, 1-carboxylate and taurine.] 
a, Examination of the alternative 
configuration of α-KG in the FtmOx1–α-KG binary complex using the 
configuration of α-KG in the TauD–α-KG binary complex. We modelled 
α-KG in this alternative binding mode and calculated the difference map. 
In the Fo − Fc map, strong positive density (green) and negative density (red) 
are shown even when contoured to high level (3.3σ), indicating that this 
configuration is not correct for the FtmOx1–α-KG complex. b, The Fo − Fc 
map at the active site of the FtmOx1–fumitremorgin-B complex. A model 
of the substrate fumitremorgin B is superimposed onto the difference map, 
which is contoured at 2.8σ. c, Side-by-side comparison of FtmOx1 and TauD 
active-site topologies. In the left panel, the superimposition of the binary 
structures of FtmOx1–α-KG and FtmOx1–fumitremorgin-B (1) shows that 
the remaining site for oxygen binding and activation is blocked from the 
substrate by Y224. d, In contrast, in the structure of the TauD–taurine–α-KG 
tertiary complex, the remaining site for O2 binding and activation directly 
faces the substrate (taurine). 


Extended Data Figure 6 | Characterization of FtmOx1 Y224F variant. 
[Figure: panel a plots absorbance; panels b–e are MS/MS spectra plotting 
relative abundance against m/z, with annotations marking oxidation (*) and 
di-oxidation (°) of phenylalanine.] 
a, Self-hydroxylation reaction in Y224F-substituted FtmOx1. Formation of 
DOPA upon exposure of the Y224F-substituted FtmOx1–α-KG complex 
to O2. b–e, MS/MS analyses of Y224F-substituted FtmOx1. b, MS/MS 
spectrum of the triply charged parent ion at m/z 768.4109 of a tryptic 
digested peptide (residue 219–237) from wild-type FtmOx1. c, MS/MS 
spectrum of the triply charged parent ion at m/z 763.0793 of a tryptic 
digested peptide (residue 219–237) from Y224F-substituted FtmOx1. 
d, MS/MS spectrum of the triply charged parent ion at m/z 768.4109 
of a tryptic digested peptide (residue 219–237) after exposure of the 
Y224F(FtmOx1)–α-KG tertiary complex to O2. e, MS/MS spectrum of the 
triply charged parent ion at m/z 773.7426 of a tryptic digested peptide (residue 
219–237) for DOPA formed upon exposure of the FtmOx1(Y224F)–α-KG complex 
to O2 in the absence of substrate fumitremorgin B. 
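
The parent-ion values in this legend can be cross-checked with simple arithmetic: for ions of charge z, a difference in m/z corresponds to a neutral mass difference of z times that value. The sketch below applies this to the triply charged ions quoted above; the function name is hypothetical, and the comparison with the monoisotopic mass of oxygen is an illustration rather than part of the authors' analysis.

```python
MONOISOTOPIC_OXYGEN_DA = 15.9949  # monoisotopic mass of one oxygen atom (Da)
CHARGE = 3                        # all parent ions in panels b-e are triply charged


def neutral_mass_shift_da(mz_from: float, mz_to: float, charge: int = CHARGE) -> float:
    """Mass difference (Da) between two ions of the same charge state."""
    return (mz_to - mz_from) * charge


# Unmodified Y224F peptide (panel c) -> ion observed after O2 exposure (panel d)
first_oxidation = neutral_mass_shift_da(763.0793, 768.4109)
# Ion in panel d -> ion assigned to DOPA (panel e)
second_oxidation = neutral_mass_shift_da(768.4109, 773.7426)

print(f"c -> d: {first_oxidation:.4f} Da")
print(f"d -> e: {second_oxidation:.4f} Da")
```

Both steps come out at about 15.995 Da, matching the addition of one oxygen atom each time, in line with the oxidation and di-oxidation annotations in the spectra.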


Extended Data Figure 7 | Mechanistic model for the production of dealkylation products in FtmOx1 Y224A or Y224F variants. 
[Scheme: reaction pathways I′ and II′; labelled species include verruculogen (2).] 


Extended Data Figure 8 | Pre-steady-state analyses of FtmOx1 reactions. 
[Figure: panel a, HPLC traces against time (min); panel b, signals against 
time (s) on a logarithmic axis from 0.004 to 10 s, with left and right y axes.] 
a, HPLC chromatograms for FtmOx1 reactions chemically quenched at the 
indicated times. The reaction mixture in 100 mM Tris-HCl (pH 7.5) buffer 
contained FtmOx1 (0.65 mM), Fe(II) (0.58 mM), α-KG (12 mM), substrate 
(0.58 mM), and 20% glycerol. The mixture was mixed with O2-saturated 
buffer to initiate the reaction. There is an extra signal next to compound 3, 
which might be due to other chemicals released during the quench process. 
Results from the chemical quench experiment indicate that FtmOx1 catalysis 
is on the timescale of a few seconds per cycle. b, Time-dependent 420 nm 
absorption change (black solid curve) determined by stopped-flow optical 
absorption spectroscopy and the concentrations of the high-spin Fe³⁺ species 
(blue squares) and the g = 2 species (red dots) determined in the rapid-freeze- 
quench EPR experiments. The black solid curve is associated with the left 
y axis and is from the average of two stopped-flow trials. The blue squares 
and red dots are associated with the right y axis and are from the average of 
two rapid-freeze-quench EPR experiments. The experiments were repeated 
twice, and error bars reflect the uncertainty of the packing factor of rapid- 
freeze-quench EPR samples, which is around ±10%. 


Extended Data Figure 9 | EPR spectroscopic analyses of FtmOx1 
reactions. 
[Figure: panels a–c plot EPR signal against magnetic field (mT).] 
a, X-band EPR spectra measured at 19 K in reaction samples 
prepared at the indicated times. The black line shows the sample containing 
the FtmOx1–Fe(II)–α-KG complex in the absence of O2. (There is a very small 
signal in the g ≈ 4.3 region, only accounted for by <5 μM iron in the sample, 
which might be due to a very small amount of Fe³⁺ from inactive enzyme.) 
Bottom, the reaction sample freeze-quenched at ~0.2 s after mixing the 
FtmOx1–Fe(II)–α-KG complex with O2. It has two signals: an Fe³⁺ (g = 4.54, 
4.26, and 3.93) and a radical signal at the g = 2 region. b, X-band EPR spectra 
measured at 19 K for samples freeze-quenched at the indicated times showing 
the formation of high-spin ferric species on the time scale within 1 s. The 
reaction was initiated by mixing the FtmOx1–Fe(II)–α-KG complex with O2. 
g-values are indicated in the figure. c, X-band EPR spectra measured at 19 K 
for samples freeze-quenched at 0.05 s and the spectral simulation for an 
S = 5/2 high-spin ferric species. The simulation parameters are: D = 0.3 cm⁻¹, 
E/D = 0.266, σ(E/D) = 0.03, and g = 4.54, 4.26, 3.93. Measurement conditions 
in a–c: microwave frequency, 9.64 GHz; microwave power, 0.2 mW; 
modulation amplitude, 1 mT; and modulation frequency, 100 kHz. 


Extended Data Table 1 | X-ray crystallography data collection and refinement statistics 

                          FtmOx1                    FtmOx1                    FtmOx1•α-KG               FtmOx1•fumitremorgin B 
                          (Se-Met)                  (Native)                  complex 
Data collection 
Space group               P12₁1                     P12₁1                     P12₁1                     P12₁1 
Cell dimensions 
  a, b, c (Å)             60.6, 45.8, 105.2         60.4, 45.6, 105.4         60.6, 45.8, 105.4         60.3, 45.4, 104.8 
  α, β, γ (°)             90.0, 100.5, 90.0         90.0, 99.7, 90.0          90.0, 100.0, 90.0         90.0, 100.3, 90.0 
Resolution (Å)            48.23 - 3.49              42.91 - 1.95              36.13 - 2.54              36.05 - 2.11 
                          (3.63 - 3.51)*            (2.02 - 1.95)             (2.63 - 2.54)             (2.19 - 2.11) 
R_merge                   0.114 (0.157)             0.101 (0.725)             0.125 (0.691)             0.077 (0.338) 
I/σI                      13.19 (7.64)              18.33 (2.02)              10.53 (1.50)              17.08 (2.80) 
Completeness (%)          99.90 (100.00)            99.90 (99.22)             99.93 (99.35)             99.86 (98.63) 
Redundancy                6.1 (3.2)                 6.6 (5.7)                 3.7 (3.7)                 3.7 (3.1) 

Refinement 
Resolution (Å)                                      42.91 - 1.95              36.13 - 2.54              36.05 - 2.11 
                                                    (2.02 - 1.95)             (2.63 - 2.54)             (2.19 - 2.11) 
No. reflections                                     41711                     18951                     32435 
R_work/R_free                                       0.1643/0.2043             0.1756/0.2340             0.1670/0.2033 
No. atoms                                           4906                      4704                      4884 
  Protein                                           4535                      4535                      4535 
  Ligand/ion                                        25                        23                        38 
  Water                                             346                       146                       311 
B-factors (Å²)                                      29.5                      35.8                      32.4 
  Protein                                           29.2                      35.8                      31.9 
  Ligand/ion                                        47.5                      41.0                      49.6 
  Water                                             32.0                      35.3                      [illegible] 
R.m.s. deviations 
  Bond lengths (Å)                                  0.005                     0.017                     0.004 
  Bond angles (°)                                   0.97                      0.93                      0.87 

*Highest-resolution shell is shown in parentheses. 




CORRECTIONS & AMENDMENTS 


CORRIGENDUM 
doi:10.1038/nature15533 


Corrigendum: A basal 
ichthyosauriform with a short 
snout from the Lower Triassic 
of China 


Ryosuke Motani, Da-Yong Jiang, Guan-Bao Chen, 
Andrea Tintori, Olivier Rieppel, Cheng Ji & Jian-Dong Huang 


Nature 517, 485-488 (2015); doi:10.1038/nature13866 


The data matrix in the original Supplementary Data 3 of this Letter 
reproduced the tree topology shown in Extended Data Fig. 3 but the 
accompanying character descriptions did not match the coding given 
in the data matrix. (The numbering of character states was shifted by 1 
because of a typo that occurred while editing the list in a spreadsheet, 
and the character state 3, which was erroneously numbered 4, was 
accidentally omitted from the list.) The Supplementary Information 
accompanying this Corrigendum contains the revised Supplementary 
Data 3. The revised Supplementary Data 3 also reproduces the tree 
topology shown in Extended Data Fig. 3 of the original Letter; note 
that the characters have been reordered in the revised Supplementary 
Data 3 for anatomical consistency. 

In addition, the tree statistics originally published in the Extended 
Data Fig. 3 legend were wrong because they were derived from a matrix 
where Parvinatator was removed from the original matrix. The correct 
statistics reflecting all 56 taxa in the original matrix are 243 equally 
most parsimonious trees of TL = 529, CI = 0.423 and RI = 0.796. 


Supplementary Information is available in the online version of this Corrigendum. 


544 | NATURE | VOL 527 | 26 NOVEMBER 2015 


CORRIGENDUM 
doi:10.1038/nature15737 


Corrigendum: Influence 
maximization in complex networks 
through optimal percolation 


Flaviano Morone & Hernan A. Makse 


Nature 524, 65-68 (2015); doi:10.1038/nature14604 


In the Acknowledgements section of this Letter, ‘ARL’ should read 
‘Army Research Laboratory Cooperative Agreement Number 
W911NF-09-2-0053 (the ARL Network Science CTA)’. This has been 
corrected in the online versions of the paper. 




RETRACTION 
doi:10.1038/nature15745 


Retraction: Non-blinking 
semiconductor nanocrystals 


Xiaoyong Wang, Xiaofan Ren, Keith Kahen, Megan A. Hahn, 
Manju Rajeswaran, Sara Maccagnano-Zacher, John Silcox, 
George E. Cragg, Alexander L. Efros & Todd D. Krauss 


Nature 459, 686-689 (2009); doi:10.1038/nature08072 


In this Letter, we reported the unusual non-blinking characteristics of 
the fluorescence from individual CdZnSe/ZnSe alloyed quantum dots. 
However, it has recently come to our attention that similar fluorescence 
behaviour was seen by Celso de Mello Donega, Daniel Vanmaekelbergh 
and co-workers from a single fluorophore on bare silica glass. In par- 
ticular, individual fluorescence spots from single molecules were 
found to be non-blinking, and fluorescence spectra looked similar 
to what we reported in our Letter. We corroborated their findings 
by conducting experiments of our own on bare quartz coverslips, and 
on quartz coverslips coated with polymethyl methacrylate (PMMA). 
Although these same control experiments were performed by us before 
publication, this time we clearly observed non-blinking fluorescence 
from isolated spots on the coverslip. Furthermore, the fluorescence 
spectra from these spots were in all practical respects identical to 
what we reported in our Letter. Subsequent investigations by us have 
revealed that the surprising origins of the unusual fluorescence come 
from individual, molecular defects in silica glasses, brightened by 
the polymer coating. The details of these new findings will be the 
subject of future publications¹. After examining the data of de Mello 
Donega and colleagues, and determining that we were both observ- 
ing the same phenomena, we concluded that we cannot attribute the 
fluorescence we observed to CdZnSe/ZnSe quantum dots. In view 
of these new results, we therefore wish to retract the paper and sin- 
cerely apologize for our error. All authors agree with the decision to 
retract the paper with the exception of X.R., who was unable to be 
contacted. 


1. Rabouw, F. et al. Non-blinking single-photon emitters in silica. Preprint at: 
http://arXiv.org/abs/1509.07262 (2015). 




SCIENCE PICTURE CO./SPL 


ANTIBODY ANARCHY: 
A CALL TO ORDER 


Antibodies used in research often give murky results. 
Broader awareness and advanced technologies promise clarity. 




Antibodies, with their distinctive Y-shape, are among the most widely used — and most vexing — reagents in biology. 


BY MONYA BAKER 


A mouse first alerted Clifford Saper to the fact 
that antibodies were misleading the scientific 
community. As editor-in-chief of the Journal of 
Comparative Neurology between 1994 and 2011, he 
handled scores of papers in which scientists relied 
on antibodies to flag the locations of 
neurotransmitters and their receptors. Around the 
turn of the century, related investigations began to 
roll in from researchers using knockout mice, 
animals genetically engineered to not express 
a target gene. The results were unsettling. 
Antibody staining in knockout animals should 
have shown radically different patterns from 
those in unmodified animals. But all too often 
the images were identical. “As we saw more and 
more retractions due to this, I began to realize 
that we had no systematic way to evaluate 
papers that used antibodies,” recalls Saper, now 
chair of neurology at Beth Israel Deaconess 
Medical Center in Boston, Massachusetts. 

Thus began a one-journal revolution. Saper 
and his editorial colleagues set up a policy 
of requiring extensive validation data on each 
antibody¹. The policy was good for rigour, but 
not submissions, he recalls. “Many authors 
were caught in the middle, and found it easier 
to publish their papers elsewhere.” But Saper 
persisted. His efforts eventually culminated 
in the JCN Antibody Database, an inventory 
of a few thousand antibodies that can be 
trusted for neuroanatomy. 

Today, biomedical researchers still collect 
tales of antibody woe faster than country-music 
labels spin out sad songs. The most common 
grumble is the cheating reagent: the antibody 
purchased to detect protein X surreptitiously 
binds protein Y (and perhaps ignores X alto- 
gether). Another complaint is ‘lost treasure’: a 
run of promising experiments that stalls when a 
new batch of antibodies fails to reproduce previous 
findings (see ‘A market in a bind’). 

But technological advances and shifts in 
the scientific community now promise to cut 
through this antibody quagmire. 

Antibodies are ubiquitous tools in the life 
sciences. Perhaps their most popular use is 
in western blotting to reveal the presence of a 
particular protein in cells or tissue samples, but 
they are also used to visualize proteins under 
the microscope by immunohistochemistry 
and immunofluorescence, as well as in many 
other applications that stem from an antibody’s 
presumed ability to bind specific biomolecules. 
A 2015 report from online purchasing portal 
Biocompare puts the market for research antibodies 
at US$2.5 billion a year and growing. 
The choice is dazzling: there are hundreds of 
vendors supplying products. 

It is alarming, then, to discover that antibodies 
can be unreliable reagents. Insufficient 
specificity, sensitivity and lot-to-lot consistency 
have resulted in false findings and wasted 
efforts. Antibody unreliability has taken its toll 
across studies in cancer, metabolism, ageing, 
immunology and cell signalling, and in any 
field concerned with researching complex 
biomolecules. The waste, in terms of time and 
resources, is colossal. Losses from purchasing 
poorly characterized antibodies have been estimated 
at $800 million per year, not counting 
the impact of false conclusions, uninterpretable 
(or misinterpreted) experiments, wasted 
patient samples and fruitless research time². 
Mathias Uhlén, a protein researcher at the 
Royal Institute of Technology in Stockholm, 
says that frustration with research antibodies 
has been building for years³ and that the time 
is finally ripe for improvements. “There is a 
big interest in the community to clean this up.” 


A market in a bind 


An antibody that performs differently across 
experiments can cause calamity. But the 
performance of these reagents is linked to 
how they are manufactured. 

Polyclonal antibodies are made by 
collecting the blood of an animal immunized 
with the target antigen. Any particular lot 
will therefore only be available as long as 
the animal lives. To produce monoclonal 
antibodies, a host animal is immunized 
with the target protein or relevant portion 
of it, then the B lymphocytes that recognize 
and respond to that antigen are fused to 
a myeloma cell line that can be cultured 
indefinitely to produce the desired antibody. 

Recombinant antibodies are unlike 
traditional monoclonals because they 
can be manufactured without animals. 
Instead, these antibodies are made by 
identifying an exact gene sequence for 
an antibody, either by sequencing an 
animal’s immune cells to find those that 
produce antibodies with highest affinity 
for the target, or sequentially shuffling 
gene sequences and testing the resultant 
proteins. That gene can then be introduced 
into an appropriate cell line to produce 
antibodies. Because the identity of the 
antibody is precisely defined, the cell line 
can be regenerated if the original colony 
dies or mutates. 

The pursuit of antibody quality has 
inspired two publicly funded initiatives 
aimed at generating collections of validated 
antibodies and other protein-binding 
reagents. These produced thousands 
of new binders, but the Protein Capture 
Reagents programme, which launched 
in 2010, is already winding down, as is 
the European Union-funded Affinomics 
consortium, which launched in 2007 (ref. 8). 
Advocates say that the chosen targets, such 
as transcription factors, were particularly 
problematic and that further investments in 
such reagents would yield larger pay-offs. 

Meanwhile, polyclonals command a 
large swathe of the market. A project that 
profiled reagents used across 10,000 
biomedical papers published since 2006 
found references to 1,293 polyclonals, 
755 monoclonals and only 1 recombinant. 
Some researchers think that polyclonal 
antibodies, which can target a protein 
in multiple ways, are not only easy to 
manufacture but also particularly good at 
recognizing proteins in diverse contexts. 

Eric McIntush is chief scientific officer of 
Bethyl Laboratories in Montgomery, Texas, 
which has been selling polyclonal antibodies 
for over 40 years and plans to start selling 
recombinants in 2016. The research world 
needs both, he says. Companies simply 
cannot afford to sink funds into products 
that they may never sell. The widespread 
availability of polyclonals, which are 
currently the least expensive antibody to 
develop, may encourage experiments on 
under-investigated proteins. As targets 
become more defined and are needed for 
translational applications, he says, there will 
be a market for recombinant products. 

But researchers such as Andreas 
Plückthun, a protein engineer at the 
University of Zurich in Switzerland, think 
that polyclonals and monoclonals should 
be eliminated entirely in favour of defined 
binders. He agrees that many proteins are 
not addressed by existing reagents but 
does not see the point in making undefined 
products such as polyclonals. “Why not 
use something where the genes can be 
identified or kept?” he asks. M.B. 

SPURRED TO ACT 

Discontent has spurred action along various 
fronts. In September, Uhlén chaired the 
inaugural meeting for a working group on 
antibody validation hosted by the Human 
Proteome Organization, an international 
consortium based in Vancouver, Canada, that 
supports large-scale projects for understand- 
ing proteins. That same month, the Federation 
of American Societies for Experimental Biol- 
ogy hosted roundtables to explore problems 
with antibodies. It expects to issue recom- 
mendations early next year. The US National 
Institutes of Health (NIH) is also on the case. 
Starting in January next year, grant applica- 
tions must include a new section describing 
efforts to authenticate antibodies and other 
key resources required for experiments. Far- 
reaching solutions are likely to be hammered 
out at a meeting hosted by the Global Biologi- 
cal Standards Institute next September. The 
gathering will be held in Asilomar, California, 
where scientists gathered 40 years ago to set 
cautionary approaches for using recombinant 
genetic technology to manipulate DNA. 

“We're hoping that the community will come 
up with consensus guidelines,” says Jon Lorsch, 
director of the US National Institute of General 
Medical Sciences in Bethesda, Maryland. That 
way, both grant applicants and reviewers will 
have resources to turn to when describing how 
they will authenticate their materials. 

Such resources could take the form of a menu 
of broad-strokes criteria. “We are not talking 
about good and bad antibodies but antibodies 
that work in specific assays and specific con- 
text,” says Uhlén. Evaluation categories might 
include knockdown and knockout approaches 
to reveal whether an antibody still binds even 
in the absence of the target protein. Another 
approach would be to tag a target protein with 
a fluorescent marker to reveal whether the 
antibody also binds untagged proteins. A third 
category could compare a new antibody with 
a well-characterized one. Finally, researchers 
could run the antibody and whatever it binds 
through a mass spectrometer to analyse bound 
molecules for the expected protein fragments. 

Several vendors have announced their own 
characterization efforts, and new technologies 
are helping. Alan Hirzel, chief executive officer 
of Abcam, a life-sciences reagents provider in 
Cambridge, UK, says that to verify that its commercial 
antibodies perform as expected, the 
company is using a genome editing method 
called CRISPR-Cas9, which makes precise 
changes in DNA. The company is testing antibodies 
on human cell lines in which target 
genes have been disrupted by CRISPR-Cas9 
and then posting results for each reagent tested. 


Pairs of antibodies can be designed to signal (red) only when both detect the same target protein’. 
REPRODUCED WITH PERMISSION OF OXFORD UNIVERSITY PRESS 


“We now really have the technologies we 
need that allow us to carry out those characteri- 
zations, whereas 5 or 10 years ago, we simply 
didn't,” says Klaus Lindpaintner, chief scientific
officer at Thermo Fisher Scientific, a life-sci- 
ences tools provider in Waltham, Massachu- 
setts. Those companies with characterization 
data are starting to view this as a competitive 
advantage. In June this year, life-sciences com- 
pany Bio-Rad in Hercules, California, launched 
a line of antibodies that have been tested for 
off-target activity in western blots against 12 
different cell lines. 


Since mid-2014, Proteintech, an antibody
manufacturer in Chicago, Illinois, has been
using small interfering RNA to knock down
gene expression in each new antibody
product, assessing whether the signal
subsides with the expression of the target gene. 
Such efforts are nascent, however, with only a 
tiny fraction of companies’ catalogues being 
subjected to validation. 

And not all companies disclose the specific 
conditions of testing, or whether an antibody 
has performed poorly under those conditions, 
says Gordon Whiteley, lab director at the NIH’s 
Antibody Characterization Program, which 
aims to create reliable antibodies for use in 
cancer biology. The example his programme 
sets in terms of supplying testing protocols and 
resulting data could be just as important as the 
reagents themselves, he says. 

There will be no single best way to test
antibodies, says Roberto Polakiewicz, chief scientific
officer of Cell Signaling Technology, an
antibody manufacturer in Danvers, Massachu- 
setts. “Developing an antibody is a scientific 
endeavour. You need people who know what 
experiments to do to validate an antibody.” If 
customers cannot see the data and make their 
own judgements, they need to look for a new 
antibody, he says. 

But researchers sometimes take only a 
cursory look at data, and many do not realize 
that an antibody’s performance in a given tissue
or application, such as western blotting, says 
little about whether it will work in other sorts 
of experiments. 

And commercial providers cannot guarantee 
that a given antibody will work for every tissue 
type and experimental condition, warns Paul 
Sawchenko, a neuroscientist at the Salk Institute 
in San Diego, California. “Unless one is so for- 
tunate as to have had someone else demonstrate 
specificity in the same tissue from the same spe- 
cies under the same experimental conditions, 
you should be obliged to do this yourself.”


VITAL INFORMATION 

It would be more efficient to learn from other 
researchers’ work, but fewer than half of the 
publications that describe antibody experiments
report which specific reagent was actually
used⁴. Even when authors do include a
catalogue number, companies may discontinue 
products and sell off lines, making them hard 
to track, says Anita Bandrowski, an informa- 
tion scientist at the University of California, 
San Diego. Bandrowski is group leader at the 
Resource Identification Initiative, an NIH- 
backed programme involving a diverse group of 
academic collaborators. The initiative has been 
instrumental in establishing unique identifiers 
for antibodies and persuading dozens of jour- 
nals to ask authors to specifically name which 
antibodies they are using. 




HUMAN PROTEIN ATLAS, WWW.PROTEINATLAS.ORG 




Three antibodies (green) against the same mitochondrial protein. The unexpected pattern on the right shows the third antibody binds an unintended protein. 


Information is beginning to accumulate. 
More than two dozen web portals have sprung 
up to help researchers select antibodies. Some 
collect user reviews on antibody performance 
and offer comparison tools. The Antibody 
Validation Channel, a project of the scientific 
publisher F1000, allows researchers to post 
their accounts and even request peer review. 
Biocompare has hired a content editor whose 
sole focus is to reach out to the research com- 
munity and get them to write reviews. 

Some antibody suppliers, such as St John’s 
Laboratory in London, offer researchers free 
products in exchange for testing and sharing 
the results. Antibodies-online, a marketplace
for antibodies, arranges for an independent 
third party to perform validation. At Anti- 
bodypedia’s knockdown initiative, launched 
in September, life scientists can earn hundreds 
of dollars in free reagents if they submit data 
showing that gene-silencing reagents such as 
small interfering RNA or CRISPR-Cas9 elimi- 
nate an antibody signal for a given target. 

But many scientists are wary of information 
from anonymous reviews. Data supplied by 
both users and companies can be sparse, and 
some projects share data only if they confirm 
that an antibody works as expected. “Some- 
times it seems easier to hire a detective than to 
order a specific antibody,” concludes an overview
of antibody portals⁵.


FUTURE ASSESSMENTS 

Some researchers are developing mechanisms to 
compare antibodies directly. Aled Edwards at the 
University of Toronto, Canada, is director of the 
international Structural Genomics Consortium 
(SGC). He and his SGC colleagues used mass 
spectrometry to detect and compare the sets of 
proteins pulled down by immunoprecipitation 
with more than 1,000 antibodies⁶. The collaboration
ran across 5 reference laboratories, took
4 years and cost US$3 million, not counting 
in-kind donations. Ultimately, it established a 
procedure to score antibody quality and share
quantitative information about its performance,
specifically for ‘pull-down’ experiments, in
which proteins are pulled out of solution using 
antibodies. 

Fridtjof Lund-Johansen, a proteomics 
researcher at Oslo University Hospital in Nor- 
way, is developing an ambitious bead assay 
that tests thousands of antibodies at once⁷. The
plan is to separate cellular proteins into many 
different fractions, then profile the proteins in 
each fraction using two different methods. One 
is mass spectrometry and the other is a bead- 
based array with thousands of antibodies. The 
mass spectrometry data serve as a reference for 
the results obtained with antibodies. Turning the 
idea into a refined assay will take considerable 
work, Lund-Johansen admits. “It is extremely 
ambitious. It is totally crazy, but it is the only 
way to go.” Other scientists are intrigued by the
approach but wonder if it will predict antibody 
performance in common techniques. 

Blanket assessments of antibodies can be 
overinterpreted, says Ulf Landegren, a proteom- 
ics technology developer at Uppsala University 
in Sweden. “It is far more meaningful to discuss 
the ability of assays to detect the correct protein, 
rather than whether antibodies or other binders 
bind the right protein.” A case in point is
cross-reactivity, when an antibody binds proteins
other than its specified target. Cross-reactivity 
depends not just on a particular antibody, but 
also on the complexity of a sample, the con- 
centration of the antibody and the rarity of the 
target protein. He recommends that rather than 
relying on a single antibody, researchers should
instead test antibodies in pairs that are designed
to bind to different parts of a target protein.
Parts of a sample labelled with both reagents are
less likely to represent off-target binding. 

One problem with this approach is that it is 
hard for scientists to know if they are purchas- 
ing different antibodies. Vendors often obtain 
products from different sources and are not 
required to disclose the original manufacturer. 
As a result, researchers who want to compare
several antibodies may end up comparing
identical products sold by several vendors. A 
handful of companies, including Genlogica 
and One World Laboratories, both in San 
Diego, California, only sell products labelled 
by the original manufacturer and offer ‘trial 
size’ antibody batches so that researchers can 
test products side by side in their labs. 

The toughest challenge is not so much in 
antibody characterization but in persuading cell 
biologists to hold back on using antibodies until 
these are thoroughly evaluated, says Edwards, 
although he doubts that scientists will become 
savvier unless funders and publishers force the 
issue. “Right now we have an unregulated mar- 
ket, where you don't have to have any quality 
to sell your product.” In other words, he says, 
guidelines, characterization data and conscien- 
tious vendors only matter if researchers invest 
effort into selecting reagents.


Monya Baker writes and edits for Nature in 
San Francisco, California. 


1. Saper, C. B. & Sawchenko, P. E. J. Comp. Neurol. 465, 161–163 (2003).
2. Bradbury, A. et al. Nature 518, 27–29 (2015).
3. Bordeaux, J. et al. BioTechniques 48, 197–209 (2010).
4. Vasilevsky, N. A. et al. PeerJ 1, e148 (2013).
5. Pauly, D. & Hanack, K. F1000Research http://dx.doi.org/10.12688/f1000research.6894.1 (2015).
6. Marcon, E. et al. Nature Methods 12, 725–731 (2015).
7. Wu, W. et al. Mol. Cell. Proteomics 8, 245–257 (2009).
8. Nature Methods 12, 373 (2015).
9. Conze, T. et al. Glycobiology 20, 199–206 (2010).




The Technology Feature ‘Connectomes
make the map’ (Nature 526, 147–149;
2015) misnamed the MultiSEM model
and gave the wrong citation in reference
3. MultiSEM 505 should have been Zeiss
MultiSEM, and ref. 3 should have referred to
Zingg, B. et al. Cell 156, 1096–1111 (2014).




FUTURES SCIENCE FICTION


BY S. J. ROSENSTEIN 


Hey, there. How you doin’? Yeah,
me too. My wife likes the

heat, but it’s too much for 
me. It's why I come here, the best 
thing about this place is the air 
con. No one would say it’s 
the coffee. No offence. It’s 
good enough for me, I’m
happy as long as I don't 
have to drink any of that 
fancy frou-frou stuff. 
I'll have mine black. 
My wife made me give 
up cream, she won't 
stop going on about 
my cholesterol. 

Yeah, I seen the news. 
Come on now, you don’t 
believe that garbage, do 
you? Them Claimers is as 
crazy as my mom, and she 
thinks the CIA is spying on 
her through her sprinkler. You 
don't think someone would have 
noticed if the president was being 
controlled by aliens? They’ve got to 
have about a dozen doctors and secret- 
service agents watching him the whole 
time. Don't tell me you believe they can see 
the future, too. That’s what they say. Kooks 
and slackers with nothing better to do with 
their lives, sitting in their basements work- 
ing themselves into a frenzy over nothing. 
Look here now, if I was an alien and I could 
see the future, I'd just buy me a lottery ticket 
and retire to Hawaii. 

No, you're just plain wrong. I got more 
right to an opinion than you, cos I know 
what I’m talking about. You think just ‘cos I 
don’t wear a thousand-dollar suit I’ve never 
met the great and the good? I met the presi- 
dent only two weeks ago. Yeah. Yeah, I guess 
it was exciting. Well, the company I work for, 
they make the toilets for the shuttle. The one 
the president used to go up to the alien ship, 
yeah. Well, just on the way back, after he’d
met the aliens and taken all those photos
you seen in the papers, the toilet got blocked
up. They didn’t know how soon they’d need
to go back, so I drove up right away. Blew 
through a few stop signs too, I kind of hoped 
a cop would stop me so I could tell them I 
was on the way to fix the president's toilet. 
Didn't happen, though. 

So I got to the spaceport just as the shuttle 
was getting in, and the president got off. I’d




CLAIMED


First contact. 


sort of wormed my way forward so I could 
get on as fast as possible, ‘cos they'd been 
pretty mad about the toilet and I wanted to 
show that even though we're just a tiny com- 
pany, we still give a great service. The presi- 
dent looked real tired, and as I was trying 
to get to the door he looked up and saw me, 
and took a few steps towards me, and shook 
my hand. This hand right here touched the 
president. Huh. No, I dunno what that is. 
Little scratch or something. Well, it looks a 
bit weird, but it doesn't feel infected. It’s been 
there a couple of weeks. Nothing to worry 
about. 
Anyway, I think the president thought 
I was a foreign dignitary or something, he 
started talking about the future of our world 
and strategic realignments and trade agree- 
ments, and what a great deal the aliens were 
offering us and how wonderful the future 
was gonna be. I just sort of stood there, 
didn’t really know what to say. One of
the spooks got to him pretty quick and
whispered something to him, and he
snapped out of it OK. He
thought it was pretty funny when he real- 
ized I was there to fix the toilet. We had 
a laugh. Well, no, I didn't. But I didn't
want to tell him. Would be kind 
of awkward, wouldn't it, meet- 
ing the president and the first 
thing you say is you didn’t 
vote for him? 
So I went in and fixed 
the toilet. Wouldn't you 
know it, one of the 
spooks had managed 
to drop one so big and 
dense it blocked the 
system. No, I don’t 
think it was the presi- 
dent. He just doesn’t 
look like that sort of 
guy. Anyways, there 
you have it. I’ve met 
him, and so I think I have 
more right to an opinion 
than some woman on the 
TV with more hair than brain 
cells. Switch it over, won't you? 
Let’s see the game. 
You a Yankees fan? Nah, me nei- 
ther. Didn't think so, not down here, but 
it doesn’t hurt to ask. Not like I'd cheer the 
Red Sox either, but you gotta feel sorry for 
their fans. Mind you, they'll be happy this 
time. Well, I know it looks like that, but they 
ain't gonna lose. That guy coming up to the 
plate, he'll hit a home run. Don’t ask me how
I know, I just know. Same way you wake up 
in the morning sometimes and you know it’s 
gonna be the sort of day when a bird craps 
on you. Only, like, stronger. There he goes. 
See, didn’t I tell you? You should have put 
money on it. 

Well, I got to be going. Look here, no hard 
feelings, eh? Sorry I was a bit touchy, I just 
get so mad when I hear the nonsense peo- 
ple are spouting. Shake on it? Oh, I’m sorry. 
Must have scratched you with my ring or 
something. Don't worry, it'll heal right up. 
Well thanks, I’m glad to hear you say that. It 
sure makes me feel good, knowing someone 
like me can change the way someone else 
thinks. You have a good day too.


S. J. Rosenstein is a research scientist with 
a secret identity as a writer, although both 
incarnations wear glasses and neither are 
particularly mild mannered. She complains 
about life at alackoftheologyandgeometry.wordpress.com.


ILLUSTRATION BY JACEY 


nature 


OVARIAN CANCER: BEYOND RESISTANCE 




Successfully treating the cancer requires 
overcoming the almost inevitable 
development of resistance to standard 
platinum-based therapy. 


BY DAVID HOLMES 


Ovarian cancer is the most common cause of gynaecological-cancer-associated
death. Although the past 40 years have seen our knowledge of the disease advance,
translating that improved understanding into a tangible clini- 
cal benefit has been a tortuous process. The last major break- 
through in treatment came 20 years ago, with the addition of 
a taxane (paclitaxel or docetaxel are commonly used to treat 
ovarian cancer) to one of the several variants of platinum-based 
chemotherapy that remain the mainstay of treatment. Since 
then, refinements to surgery and to the timing and delivery 
of chemotherapy have produced only slight improvements in 
outcomes. In the United States, for example, 5-year survival has 
inched up from about 40% in 1985 to a still parlous 45% today. 
By comparison, 5-year survival for breast cancer stands at 90%. 

Two factors account for much of the stubbornly high mor- 
tality and morbidity associated with ovarian cancer — late 
diagnosis and treatment resistance. There are currently no 
approved methods to screen for ovarian cancer, although 
promising preliminary results released earlier this year from 
the UK Collaborative Trial of Ovarian Cancer Screening may 
begin to change that. Around 60% of women are diagnosed 
with late-stage disease that has already spread within the 
abdomen. As many as 80% of these women will respond well 
to initial treatment with platinum-based therapy, but almost 
all will experience multiple recurrences of disease, with ever 
shorter disease-free intervals. Ultimately, almost all of these 
women will die from the disease, and most will die from a
disease that is resistant to platinum chemotherapy. 

As with most solid malignancies, resistance to platinum- 
based treatment can be intrinsic or acquired, and is brought 
about through a bewildering array of mechanisms. From
pumps that eject the drug from the cell to the
expression of genes that enable alternative growth pathways,
cancer cells leave no stone unturned in their bid to survive and
proliferate. Further complicating matters is the difficulty of
knowing which mechanism or mechanisms are active in any 
particular person. 

The good news is that there are a huge number of experi- 
mental therapies in development, and it is hoped that these can 
be added to platinum-based chemotherapy to help deliver a 
knockout blow, or at least prolong the intervals between treat- 
ment and improve patients’ quality of life. Vaccines to activate 
the immune system against tumours, agents to interfere in 
DNA-repair pathways, and therapies that choke off the supply
of blood to the tumour are all now in clinical trials for ovar- 
ian cancer. Because of the complexity of the disease and the 
mechanisms that underlie treatment resistance, it is unlikely 
that any one therapy will be a silver bullet. Nevertheless, there 
is a growing sense of optimism that researchers will be able to 
translate hard-won knowledge into improved outcomes for 
patients. 

Nature is pleased to acknowledge the financial support of 
Pharma Mar, S.A. in producing this Outline. As always, Nature 
retains sole responsibility for all editorial content. 


David Holmes is a science writer based in the United Kingdom. 


Nature Outlines are sponsored supplements that aim to unpack scientific and technical concepts 
through infographics, illustrations and short animations. The boundaries of sponsor involvement 
are clearly delineated in the guidelines available at http://go.nature.com/ecx76b. 
Feedback@nature.com. Copyright © 2015 Nature Publishing Group 




LUCY READING-IKKANDA 




OUTLINE OVARIAN CANCER
For an animated version of this graphic visit:
go.nature.com/ghn2pe 


Ovarian cancer is difficult to treat, largely because tumours are often found late and develop 
resistance to initial treatment: platinum-based therapy. New approaches promise to break
through the platinum barrier. By David Holmes; illustration by Lucy Reading-Ikkanda. 


BIG PROBLEM, LITTLE PROGRESS

BAD TIMING
The earlier that ovarian cancer is identified, the better the odds are that treatment will be successful. Women are not screened because current methods are not reliable enough to predict whether or not women have the disease. Early symptoms of ovarian cancer are often confused with irritable bowel or premenstrual syndrome, so most people are diagnosed with late-stage disease.

THE TOLL OF RESISTANCE
Of US women diagnosed with ovarian cancer, 60% have late-stage disease. Most of these initially respond well to treatment with a combination of paclitaxel (a drug that interferes with cell division) and carboplatin (a platinum-based drug that damages cancer-cell DNA). However, more than half will relapse within 18 months of diagnosis.

SLOW PROGRESS
The late stage at which most ovarian cancers are diagnosed, the fact that such a high proportion become resistant to platinum-based chemotherapy, and the small number of approved alternatives to platinum therapy, mean that ovarian cancer has a relatively low five-year survival rate. In the United States, for example, it is just 45.6%.


60% of cases are diagnosed as late-stage disease ...
... and 25% of these women will have recurrence with a platinum-resistant tumour within the first 12 months.

239,000 new cases of ovarian cancer worldwide in 2012
192,000 deaths

The proportion of women with ovarian cancer who survive five years or more after diagnosis has changed little in more than a decade. The outlook is bleaker than for women with breast cancer.

[Chart: five-year survival for 1999 (ovarian), 2012* (ovarian) and 2012* (breast)]

*Age standardized estimate

SOURCES: GLOBOCAN.IARC.FR; SEER.CANCER.GOV

THE ROOTS OF RESISTANCE 


Most researchers agree that, in common with many cancers, a small population of platinum-resistant cancer cells exists 
in ovarian tumours before treatment and flourishes once treatment has killed their platinum-sensitive counterparts. This 
results in regrowth of the tumour, and a low probability that it will respond to further treatment with platinum-based drugs. 


The tumour is made up of platinum-sensitive cancer cells (purple), and a small population of platinum-resistant cells (green).

Platinum enters cancer cells where it binds to and damages DNA. In platinum-sensitive cells, this results in programmed cell death.

Death of platinum-sensitive cells

Dividing cells

Platinum-resistant cells multiply uncontrollably, forming a resistant tumour.

The platinum-resistant cells use a complex repertoire of mechanisms to mitigate the effects of platinum therapy, including DNA damage repair, decreased drug uptake, increased platinum removal and sequestration of the metal into lysosomes.

Platinum removed through membrane pumps

Decreased platinum uptake

Platinum lysosomal sequestration



THE NEW WAVE OF THERAPY 


From priming the immune system to fight ovarian tumours to cutting off the cancer’s blood supply, researchers
are testing a variety of ways to overcome resistance to platinum-based chemotherapy.


SCRAMBLING THE CODE 


Ramping up DNA-repair pathways 
is one of the ways that cancer cells 
resist the DNA-damaging effects
of platinum. If those DNA-repair
pathways could be dampened down 
it might be possible to resensitize 
cancer cells to platinum. There are 
several drugs in development that 
aim to do just that. PARP inhibitors 
disrupt the mechanism by which 
damaged parts of DNA are removed, 
and the drug trabectedin binds 
directly to and damages the DNA. 
Both have shown early promise. 

The drug topotecan blocks the 
action of the enzyme TOP1, which 
helps to repair DNA damage, and is 
already licensed for the treatment of 
recurrent ovarian cancer. However, its 
effect on overall survival is limited. 


DNA damage and disruption of 
DNA-repair mechanisms lead to
cell death. 


Damaged DNA

DNA repair mediated by PARP or TOP1 is inhibited

SOURCE: CANCERRESEARCHUK.ORG


IMMUNE BOOSTERS 


Priming the immune system to 
recognize and attack cancer cells 
might be an effective way of stunting 
the growth of tumours in people with 
recurrent ovarian cancer. A UK trial 
called TRIOC is testing whether the 
TroVax vaccine, which has this priming 
effect, can boost an individual’s 
anticancer immune response enough 
to slow the growth of recurrent ovarian 
tumours and delay the need for a 
second line of chemotherapy. In the 
trial, the vaccine is given to people who 
have high levels of a marker called 
CA125 in their blood, which indicates 
that a cancer may have returned. 


The vaccine-primed immune system 
releases antibodies and T cells to 
bind to antigens on the cell surface. 


Antibodies

Antigen


HORMONE THERAPY 


Similar to many breast cancers, some 
ovarian cancer cells have oestrogen 
receptors on their surface and may 
require the hormone to grow and 
spread. This has led researchers
to test the hormone treatment
tamoxifen, which is often used to 
treat oestrogen-receptor-positive 
breast cancers, in women with 
advanced ovarian cancer. Tamoxifen 
blocks oestrogen from reaching the 
cells and has been shown to work 
for a small proportion of women 
with recurrent cancer that does not 
respond to chemotherapy. Several 
other hormone treatments, such as 
letrozole and anastrozole, are also in 
clinical trials. 


Tamoxifen competes with oestrogen 
to bind to oestrogen receptors, 
preventing oestrogen-induced cell 
division and tumour growth. 


Oestrogen

Oestrogen receptor

Dividing cells

Tamoxifen blocks oestrogen receptor



STARVING THE TUMOUR 


Several treatments are in clinical 
trials to assess whether blocking the 
blood supply to tumours can slow 
down their recurrence. The antibody 
bevacizumab prevents the formation 
of new blood vessels by inhibiting
activity of the signalling protein VEGF, 
which is involved in the growth of 
blood vessels. The drug has already 
been approved by the US Food
and Drug Administration and the
European Medicines Agency for use 
in combination with chemotherapy 
for platinum-resistant relapsed 
ovarian cancer. Another drug — 
cediranib — disrupts the formation 
of blood vessels around the tumour 
by inhibiting a type of signalling 
protein called tyrosine kinase. In a 
trial called ICON6, the drug increased 
survival by three months compared 
with standard treatment for recurrent 
ovarian cancer. Several other drugs 
that block blood-vessel growth, such 
as combretastatin, pazopanib and 
trebananib, are also in clinical trials. 


Cancer cells release VEGF to 
promote the growth of blood 
vessels around the tumour. Drugs 
that disrupt the VEGF-signalling 
pathway prevent the formation of 
vessels and limit tumour growth. 


Activated VEGF 
receptor 


Drug-inhibited 
receptor 


Blood vessels 
degenerate, cutting 
off the tumour’s 
nutrient supply and 
causing it to stop 
growing or regress. 






Despite the enormous efforts that
go into cancer research, the current
approach to drug discovery is largely 
constrained by what we already know 
about cancer and to some extent the 
imagination of the human mind. 


Unlocking the Potential of 
Marine-inspired Oncology 


PharmaMar is pursuing a different 
approach that leverages nature's 
evolution in the enormous biodiversity 
found in the world’s oceans, discovering 
organisms with unique biophysiology 
and powerful anticancer properties. 


PharmaMar’s rigorous approach to 
research is unlocking the remarkable 
therapeutic potential of marine 
ecosystems to bring new hope for 
better anticancer therapies. 


Pharma Mar, S.A. www.pharmamar.com 




